Patent abstract:
Shape information is scalably encoded and decoded by interleaved scan line (ISL) and raster scan line (RSL) methods, and the encoded and decoded shape information is used for encoding texture information. The shape information of a chrominance (UV) component is encoded to compensate for the chrominance (UV) component. The encoding method is applied independently to the shape and texture components of each block. A scalable still-image encoder using wavelets compresses pixels by exploiting the relationships between ISL pixels within the shape layer being encoded, or between pixels of two layers, when encoding the shape across layers. It is therefore possible to restore the shape and texture at high speed by performing scalable encoding according to resolution, for example when searching for an image in a database or library. Also, tile coding a large image so that desired parts can be restored independently is fast and efficient.
Publication number: NL1013089A1
Application number: NL1013089
Filing date: 1999-09-17
Publication date: 2000-03-21
Inventors: Dae-Sung Cho;Se-Hoon Son;Jae-Seob Shin
Applicant: Samsung Electronics Co Ltd;
IPC main class:
Patent description:

Scalable Encoding/Decoding Methods and Apparatus in a Still Picture Encoder Using Wavelet Transform
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to scalable encoding / decoding methods and apparatus used in a still picture encoder using wavelet transform.
2. Description of the Related Art
In a conventional shape information encoding method used in a still picture encoder, the pixel information of the shapes of all layers produced by the wavelet decomposition process must be encoded. In that case, when a scalable encoding method is used, the number of pixels to be encoded increases remarkably compared to encoding all the shape information directly, so the encoding efficiency decreases. Also, the system becomes more complicated as the number of pixels to be encoded increases. When the input image is large, this effect is more pronounced, and consequently it takes a long time to restore a complete picture.
SUMMARY OF THE INVENTION
It is an object of the present invention to provide a scalable still image encoding method and apparatus which divide a still image into blocks, classify the divided blocks according to whether the exclusive OR information of each pixel can be used, and encode the blocks according to the classified encoding modes, so that arbitrary shape information can be efficiently encoded by a wavelet-based still image encoder.
It is another object of the present invention to provide a decoding method and apparatus corresponding to the wavelet-based still picture scalable encoding method and apparatus.
It is yet another object of the present invention to provide still picture encoding and decoding methods which tile an input image of any shape and recover a user-desired part of the image independently, from only part of the compressed data, without a large amount of calculation.
Thus, in order to achieve the first object, a method of scalable encoding of still image shape information using a wavelet transform is provided, which comprises the steps of wavelet transforming and scalably encoding the shape information of a luminance (Y) component, wavelet encoding the texture information of the luminance (Y) component using the wavelet transformed luminance (Y) shape information, filling in the shape information and texture information of a chrominance (UV) component using the luminance (Y) shape information, wavelet transforming and scalably encoding the filled chrominance (UV) shape information, and wavelet encoding the texture information of the chrominance (UV) component using the wavelet transformed chrominance (UV) shape information.
In the method of scalable encoding of still image shape information using a wavelet transform according to the present invention, the steps of scalably encoding the luminance (Y) shape information and scalably encoding the filled chrominance (UV) shape information each comprise the steps of obtaining the respective shape layers by shape-adaptive discrete wavelet transforming the input shape information, encoding the low-frequency band shape information of the bottom shape layer, scalably encoding the low-frequency band shape information of each layer using the low-frequency band shape information of the lower layer, with respect to each shape layer except the bottom shape layer, and transmitting the encoded shape information from the bottom layer to the top layer.
In the method of scalable encoding of still image shape information using a wavelet transform according to the present invention, the step of scalably encoding the low-frequency band shape information of each layer comprises the steps of dividing the low-frequency band shape information of the current layer and the low-frequency band shape information of the lower layer into blocks, framing the respective blocks in the shape information and determining the encoding mode, performing arithmetic encoding on the determined encoding mode, and encoding each framed block according to the determined encoding mode.
In the method of scalable encoding of still image shape information using a wavelet transform according to the present invention, when a 1x1 pixel value PL of a binary alpha block (BAB) f1(i,j) of the lower layer corresponds to the 2x2 pixel values P0, P1, P2 and P3 of a BAB f2(i,j) of the current layer, the encoding mode is determined to be an interleaved scan line (ISL) mode when all of the following conditions are met for all pixels in the BAB of the lower layer, and is determined to be a raster scan line (RSL) mode when any of the following conditions is not met.
condition1 = (f2(2i,2j) == f1(i,j))
condition2 = !(!(f2(2i,2j) ⊕ f2(2i+2,2j)) && (f2(2i+1,2j) != f2(2i,2j)))
condition3 = !(!(f2(2i,2j) ⊕ f2(2i,2j+2)) && (f2(2i,2j+1) != f2(2i,2j)))
condition4 = !(!(f2(2i+1,2j) ⊕ f2(2i+1,2j+2)) && (f2(2i+1,2j+1) != f2(2i+1,2j)))
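The mode decision above amounts to a per-pixel test on the BAB. A minimal Python sketch follows; the `[row, column]` array layout, the `numpy` representation, and the boundary guards are illustrative assumptions (the actual coder works on framed blocks whose border pixels are supplied by neighbouring BABs):

```python
import numpy as np

def isl_mode_possible(f2, f1):
    """Return True if the four conditions hold for every lower-layer
    pixel (i, j), i.e. the BAB may be coded in ISL mode; otherwise
    RSL mode must be used."""
    h, w = f1.shape
    for j in range(h):
        for i in range(w):
            # condition1: the even-positioned pixel P0 must equal PL
            if f2[2*j, 2*i] != f1[j, i]:
                return False
            # condition2: if the horizontal neighbours of P1 are equal,
            # P1 must equal them too
            if 2*i + 2 < 2*w and f2[2*j, 2*i] == f2[2*j, 2*i + 2] \
                    and f2[2*j, 2*i + 1] != f2[2*j, 2*i]:
                return False
            # condition3: the same test vertically for P2
            if 2*j + 2 < 2*h and f2[2*j, 2*i] == f2[2*j + 2, 2*i] \
                    and f2[2*j + 1, 2*i] != f2[2*j, 2*i]:
                return False
            # condition4: the same vertical test on the odd column for P3
            if 2*j + 2 < 2*h and f2[2*j, 2*i + 1] == f2[2*j + 2, 2*i + 1] \
                    and f2[2*j + 1, 2*i + 1] != f2[2*j, 2*i + 1]:
                return False
    return True
```

A fully opaque block satisfies all four conditions, while flipping a single interleaved pixel between two equal neighbours forces RSL mode.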
In the method of scalable encoding of still image shape information using a wavelet transform according to the present invention, when the encoding mode is the ISL encoding mode, the step of encoding each framed block comprises, for each pixel of the block, the steps of: not encoding the pixel when it is P0; when the pixel is P1, performing arithmetic coding on it, using context information describing the arrangement of current-layer pixels around it and a probability value, only when the pixel values to its left and right differ from each other; and when the pixel is P2 or P3, performing arithmetic coding on it in the same way only when the pixel values above and below it differ from each other.
In the method of scalable encoding of still image shape information using a wavelet transform according to the present invention, when the encoding mode is the RSL mode, the step of encoding each framed block comprises, for each pixel of the block, the steps of: not encoding the pixel when it is P0 and the corresponding lower-layer pixel PL is 0; when the pixel is P0 and PL is not 0, performing arithmetic coding on it using context information describing the arrangement of pixels of the current and lower layers around it and a probability value; and when the pixel is P1, P2 or P3, performing arithmetic coding on it using the same kind of context information and probability value.
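Under the ISL rules only a subset of the current-layer pixels is ever passed to the arithmetic coder. The following sketch (a hypothetical helper, not from the patent) enumerates those pixels; the wrap-around neighbour access is a simplification of the framing described above:

```python
import numpy as np

def isl_coded_pixels(f2):
    """Return the (row, col) positions that need arithmetic coding in
    ISL mode: P0 pixels are skipped entirely, P1 is coded only when
    its left/right neighbours differ, and P2/P3 only when the pixels
    above and below differ."""
    h, w = f2.shape
    coded = []
    for y in range(h):
        for x in range(w):
            if y % 2 == 0 and x % 2 == 0:
                continue                      # P0: known from the lower layer
            if y % 2 == 0:                    # P1: even row, odd column
                if f2[y, x - 1] != f2[y, (x + 1) % w]:
                    coded.append((y, x))
            else:                             # P2 / P3: odd row
                if f2[y - 1, x] != f2[(y + 1) % h, x]:
                    coded.append((y, x))
    return coded
```

For a uniform block nothing is coded at all, which is the source of the bit savings over coding every pixel.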
A method for scalable encoding of shape information of a still image using a wavelet transform is also provided, which comprises the steps of wavelet transforming the luminance (Y) shape information by an even-symmetric wavelet filter and scalably encoding the luminance (Y) shape information, wavelet encoding the luminance (Y) texture information using the wavelet transformed luminance (Y) shape information, and wavelet encoding the chrominance (UV) texture information using the wavelet transformed luminance (Y) shape information.
In the method for scalable encoding of still image shape information using a wavelet transform according to the present invention, the step of filling in the shape information and texture information of the chrominance (UV) component comprises the steps of obtaining downsampled shape information from the luminance (Y) shape information to compensate the 4:2:0 or 4:2:2 chrominance (UV) component, dividing the downsampled shape information into blocks according to the number of layers and extending the shape information to an area containing all pixels of the framed blocks that lie partly within the shape, and obtaining the texture information corresponding to the extension area by padding the chrominance (UV) texture information in the horizontal and vertical directions.
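As a rough illustration of the downsampling-and-extension stage, the sketch below derives a 4:2:0 chrominance mask from a luminance mask and extends it to whole blocks. The "opaque if any of the four Y pixels is opaque" rule and the block size are illustrative assumptions, not the patent's normative definition:

```python
import numpy as np

def derive_uv_shape(y_shape):
    """Derive a 4:2:0 chrominance shape mask by 2:1 downsampling of
    the luminance mask: a UV pixel is marked opaque if any of the four
    corresponding Y pixels is opaque (a conservative rule)."""
    h, w = y_shape.shape
    blocks = y_shape.reshape(h // 2, 2, w // 2, 2)
    return (blocks.max(axis=(1, 3)) > 0).astype(int)

def pad_to_blocks(mask, block=16):
    """Extend the mask so that every block partly covered by the shape
    lies wholly inside the padded support (block-based extension
    before texture padding)."""
    h, w = mask.shape
    out = mask.copy()
    for by in range(0, h, block):
        for bx in range(0, w, block):
            if out[by:by + block, bx:bx + block].any():
                out[by:by + block, bx:bx + block] = 1
    return out
```

The texture padding itself would then fill the extension area by repeating boundary texture values horizontally and vertically, as described above.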
To achieve the second object, a method of scalable decoding of encoded still image shape information using a wavelet transform is provided, which comprises the steps of scalably decoding and wavelet transforming the encoded luminance (Y) shape information, wavelet decoding the encoded luminance (Y) texture information using the wavelet transformed luminance (Y) shape information, scalably decoding and wavelet transforming the encoded chrominance (UV) shape information, and wavelet decoding the encoded chrominance (UV) texture information using the wavelet transformed chrominance (UV) shape information.
In the method for scalable decoding of shape information of a still image using the wavelet transform of the present invention, the steps of wavelet transforming the encoded luminance (Y) shape information and wavelet transforming the encoded chrominance (UV) shape information each comprise the steps of receiving the encoded shape information from the bottom layer to the top layer, obtaining the low-frequency band shape information of the bottom layer by decoding the encoded shape information of the bottom layer, scalably decoding the low-frequency band shape information by decoding the encoded shape information of each layer using the low-frequency band shape information of the lower layer, with respect to each layer except the bottom layer, and obtaining each shape layer by shape-adaptive discrete wavelet transforming the decoded low-frequency band shape information of the respective layers.
In the method of scalable decoding of encoded still image shape information using a wavelet transform according to the present invention, the step of scalably decoding the low-frequency band shape information comprises the steps of receiving the encoded shape information and dividing the shape information of the current shape layer and the shape information of the lower layer into blocks, framing the respective blocks in the shape information, performing arithmetic decoding on the encoding modes of the respective framed blocks, and decoding the encoded shape information in each block according to the decoded encoding mode.
In the method of scalable decoding of encoded still image shape information using the wavelet transform of the present invention, when a 1x1 pixel value PL of a binary alpha block (BAB) f1(i,j) of the lower layer corresponds to the 2x2 pixel values P0, P1, P2 and P3 of a BAB f2(i,j) of the current layer, the encoding mode is determined to be an interleaved scan line (ISL) mode when all of the following conditions are met for all pixels in the BAB of the lower layer, and is determined to be a raster scan line (RSL) mode when any of the following conditions is not met.
condition1 = (f2(2i,2j) == f1(i,j))
condition2 = !(!(f2(2i,2j) ⊕ f2(2i+2,2j)) && (f2(2i+1,2j) != f2(2i,2j)))
condition3 = !(!(f2(2i,2j) ⊕ f2(2i,2j+2)) && (f2(2i,2j+1) != f2(2i,2j)))
condition4 = !(!(f2(2i+1,2j) ⊕ f2(2i+1,2j+2)) && (f2(2i+1,2j+1) != f2(2i+1,2j)))
In the method of scalable decoding of encoded still image shape information using a wavelet transform according to the present invention, when the encoding mode is the ISL encoding mode, the step of decoding the encoded shape information in each block comprises, for each pixel of the block, the steps of: restoring P0 by PL when the pixel to be decoded is P0; when the pixel is P1, restoring it by the pixel value to its left (or right) when the pixel values to its left and right are equal to each other, and otherwise performing arithmetic decoding on it using context information describing the arrangement of current-layer pixels around it and a probability value; and when the pixel is P2 or P3, restoring it by the pixel value above (or below) it when the pixel values above and below are equal to each other, and otherwise performing arithmetic decoding on it using the same kind of context information and probability value.
In the method of scalable decoding of encoded still image shape information using a wavelet transform according to the present invention, when the encoding mode is the RSL encoding mode, the step of decoding the encoded shape information in each block comprises, for each pixel of the block, the steps of: restoring P0 by 0 when the pixel to be decoded is P0 and the corresponding PL is 0; when the pixel is P0 and PL is not 0, performing arithmetic decoding on it using context information describing the arrangement of pixels of the current and lower layers around it and a probability value; and when the pixel is P1, P2 or P3, performing arithmetic decoding on it using the same kind of context information and probability value.
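The ISL decoding rules above can be mirrored in a short sketch. Here `decode_bit` is a hypothetical callback standing in for the context-based arithmetic decoder, and the wrap-around neighbour access again simplifies the framing used by the real decoder:

```python
import numpy as np

def isl_decode_bab(f1, decode_bit):
    """Rebuild the current-layer BAB from the lower-layer BAB f1.
    decode_bit(pos) is consulted only for pixels that were actually
    arithmetic-coded; all other pixels follow from their neighbours."""
    h, w = f1.shape
    f2 = np.zeros((2 * h, 2 * w), dtype=int)
    f2[0::2, 0::2] = f1                    # P0: copied, never decoded
    for y in range(0, 2 * h, 2):           # P1: odd columns of even rows
        for x in range(1, 2 * w, 2):
            left, right = f2[y, x - 1], f2[y, (x + 1) % (2 * w)]
            f2[y, x] = left if left == right else decode_bit((y, x))
    for y in range(1, 2 * h, 2):           # P2 / P3: odd rows
        for x in range(2 * w):
            up, down = f2[y - 1, x], f2[(y + 1) % (2 * h), x]
            f2[y, x] = up if up == down else decode_bit((y, x))
    return f2
```

On a uniform lower-layer block the arithmetic decoder is never consulted, matching the encoder side, which emits no bits for such pixels.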
To accomplish the first object, a device for scalable encoding of still image shape information using a wavelet transform is also provided, which comprises a shape information scalable encoder for wavelet transforming and scalably encoding the shape information of a luminance (Y) component and a chrominance (UV) component, a chrominance (UV) shape/texture filler for filling in the shape information and texture information of the chrominance (UV) component using the luminance (Y) shape information and the chrominance (UV) texture information, with respect to 4:2:0 or 4:2:2 shape information, and a texture information wavelet encoder for wavelet encoding the texture information of the luminance (Y) component and the chrominance (UV) component using the shape information wavelet transformed by the shape information scalable encoder.
In the apparatus for scalable encoding of still image shape information using a wavelet transform according to the present invention, the shape information scalable encoder comprises a luminance (Y) shape scalable encoder for wavelet transforming and scalably encoding the luminance (Y) shape information, and a chrominance (UV) shape scalable encoder for wavelet transforming and scalably encoding the chrominance (UV) shape information filled in by the chrominance (UV) shape/texture filler.
In the device for scalable encoding of still image shape information using the wavelet transform of the present invention, the luminance (Y) shape scalable encoder and the chrominance (UV) shape scalable encoder each comprise a number of shape-adaptive discrete wavelet transformers for receiving a shape layer and generating the shape layers of the lower layers, a shape encoder for encoding the low-frequency band shape information of the bottom shape layer, a number of scalable coders for scalably encoding the low-frequency band shape information of the respective layers using the low-frequency band shape information of the lower layers, with respect to each shape layer except the bottom shape layer, and a multiplexer for transmitting the encoded shape information from the bottom layer to the upper layers.
In the apparatus for scalable encoding of still image shape information using a wavelet transform according to the present invention, each scalable coder comprises a means for dividing the low-frequency band shape information of the current layer and the low-frequency band shape information of the lower layers into blocks, a means for framing the respective blocks in the shape information, a means for determining the encoding mode according to whether the exclusive OR information of each pixel in the framed block can be used, a means for, when the encoding mode is the ISL encoding mode, scanning the respective pixels in a block in the ISL order, omitting the coding of pixels for which the exclusive OR information can be used, and obtaining the context information and performing arithmetic coding on the pixels for which the exclusive OR information cannot be used, and a means for, when the encoding mode is the RSL encoding mode, scanning the respective pixels in a block in the RSL order, obtaining the context information and performing arithmetic coding on the pixels.
To accomplish the second object, a device for scalable decoding of encoded still image shape information using a wavelet transform is provided, which comprises a shape information scalable decoder for scalably decoding and wavelet transforming the encoded shape information of the luminance (Y) component and the chrominance (UV) component, and a texture information wavelet decoder for wavelet decoding the encoded texture information of the luminance (Y) component and the chrominance (UV) component using the shape information wavelet transformed by the shape information scalable decoder.
In the apparatus for scalable decoding of encoded still picture shape information using a wavelet transform according to the present invention, the shape information scalable decoder comprises a luminance (Y) shape scalable decoder for scalably decoding and wavelet transforming the encoded luminance (Y) shape information and a chrominance (UV) shape scalable decoder for scalably decoding and wavelet transforming the encoded chrominance (UV) shape information.
In the device for scalable decoding of encoded still image shape information using the wavelet transform of the present invention, the luminance (Y) shape scalable decoder and the chrominance (UV) shape scalable decoder each comprise a demultiplexer for distributing the encoded shape information from the bottom layer to the upper layers, a shape decoder for obtaining the low-frequency band shape information of the bottom layer by decoding the encoded shape information of the bottom shape layer, a number of scalable decoders for scalably decoding the low-frequency band shape information by decoding the encoded shape information of the respective layers using the low-frequency band shape information of the lower layers, with respect to each shape layer except the bottom shape layer, and a number of shape-adaptive discrete wavelet transformers for obtaining each shape layer by shape-adaptive discrete wavelet transforming the decoded low-frequency band shape information of the respective layers.
In the device for scalable decoding of encoded still picture shape information using a wavelet transform according to the present invention, each scalable decoder comprises a means for receiving the encoded shape information and dividing the shape information of the current layer and the shape information of the lower layers into blocks, a means for framing the respective blocks in the shape information, a means for performing arithmetic decoding on the encoding mode determined according to whether the exclusive OR information of the respective pixels in the framed block can be used, a means for, when the encoding mode is the ISL encoding mode, scanning the respective pixels in a block in the ISL order, restoring the pixels from the exclusive OR information when it can be used, and obtaining the context information and performing arithmetic decoding on the pixels when it cannot be used, and a means for, when the encoding mode is the RSL encoding mode, scanning the respective pixels in a block in the RSL order, obtaining the context information and performing arithmetic decoding on the pixels.
To achieve the third object, a method for scalable still image coding using a wavelet transform is provided, which comprises the steps of dividing an input object of arbitrary shape into tiles of uniform size and classifying a control component, encoding a control signal with respect to each tile, wavelet transforming the shape and texture information and scalably encoding the values of the respective layers, thereby encoding the object information in each tile, and sequentially concatenating the encoded bitstreams of the respective tiles.
To achieve the third object, a method of decoding a bitstream obtained by scalable encoding of a still image using a wavelet transform is also provided, which comprises the steps of receiving the encoded bitstream, dividing the encoded bitstream into objects and classifying a control component and a number of tile components in the bitstream with respect to each object, decoding the control component, scalably decoding the shape and texture information with respect to each tile component, assembling the object information items decoded with respect to the respective tile components in each object using the decoded control component, and assembling a number of object information items on a screen.
To achieve the third object, a device for scalable still image coding using a wavelet transform is provided, comprising one or more tile dividers for dividing an input object of any shape into tiles of uniform size and classifying control components, one or more control signal encoders for encoding the control components classified by the tile dividers, a number of still image encoders for receiving the tiles distributed by the tile dividers, wavelet transforming the shape and texture information of the tiles, and scalably encoding the values of the respective layers, and a multiplexer for sequentially concatenating the encoded bitstreams of the respective tiles.
To achieve the third object, a device for decoding a bitstream obtained by scalable coding of a still image using a wavelet transform is provided, comprising: a demultiplexer for receiving the encoded bitstream, dividing the bitstream into objects, and classifying a control component and a number of tile components in the bitstream with respect to each object, one or more control signal decoders for decoding the control component, a number of still picture decoders for receiving a tile component and scalably decoding the shape and texture information in the tile, one or more tile assemblers for assembling the decoded tile components in each object, and an object assembler for assembling on a screen a number of object information items put together by the tile assemblers.
BRIEF DESCRIPTION OF THE DRAWINGS
The foregoing objects and advantages of the present invention will become more apparent through a detailed description of a preferred embodiment thereof with reference to the attached drawings, in which: Figures 1A and 1B are block diagrams showing the construction of a still picture encoder and a still picture decoder using wavelet transform; Fig. 2 is a flow chart showing area-based padding processes; Figures 3A and 3B show a method of extending shape information to blocks and a method of padding the shape information in units of an area, respectively; Fig. 4 is a block diagram showing the construction of a scalable shape encoder using wavelet transform; Fig. 5 is a block diagram showing the construction of a scalable shape decoder using wavelet transform; Fig. 6 describes a method of forming an image pyramid to which a shape-adaptive wavelet is applied; Fig. 7 shows three-layer scalable shape coding; Figures 8A and 8B describe methods of scalable encoding/decoding in units of a binary alpha block; Figures 9A, 9B and 9C show the framing of binary alpha blocks; Fig. 10 describes the conditions for determining the mode of encoding a binary alpha block; Fig. 11 describes a method of interleaved scan line (ISL) encoding of a binary alpha block; Fig. 12 is a flow chart showing a method of ISL encoding a binary alpha block; Fig. 13 is a flow chart showing a method of ISL decoding a binary alpha block; Fig. 14 is a flow chart showing a method of raster scan line (RSL) encoding and decoding a binary alpha block; Figures 15A and 15B show the sequences in which the pixels of a block are encoded in the ISL encoding mode and the RSL encoding mode, respectively; Figures 16A, 16B and 16C show context information for binary arithmetic coding; Figures 17A and 17B are block diagrams showing an object-based still picture encoder and an object-based still picture decoder, respectively, each using a tile operation; Fig. 17C shows an arbitrarily shaped object in a tile structure; Figures 18A to 18F show syntaxes of bitstreams compressed by the still picture encoder of the present invention; Figures 19A and 19B show recovered images (using an odd-symmetric filter) in layer 3 of an image of children, the shape information of the chrominance (UV) component not being corrected in Figure 19A and being corrected in Figure 19B; and Figures 20A and 20B show recovered images (using an odd-symmetric filter) in layer 3 of an image of Fish & a logo, the shape information of the chrominance (UV) component not being corrected in Figure 20A and being corrected in Figure 20B.
DESCRIPTION OF THE PREFERRED EMBODIMENTS In the following, the present invention will be described in detail with reference to the attached drawings.
Referring to Fig. 1A, an embodiment of a still picture encoder using the wavelet transform of the present invention contains a luminance (Y) shape scalable encoder 103, a luminance (Y) texture wavelet encoder 104 using wavelet transform, a chrominance (UV) shape/texture filler 106, a chrominance (UV) shape scalable encoder 107, a chrominance (UV) texture wavelet encoder 108 using wavelet transform, and a multiplexer 109. Referring to Fig. 1B, an embodiment of a still picture decoder using wavelet transform for recovering an image from an encoded bitstream contains a luminance (Y) shape scalable decoder 113, a luminance (Y) texture wavelet decoder 114, a chrominance (UV) shape scalable decoder 116, and a chrominance (UV) texture wavelet decoder 117.
In a 4:2:0 or 4:2:2 color image, filling in the shape and texture information is necessary to solve the problem of color distortion in the scalable layers. Referring to Fig. 2, filling in the shape and texture information consists of processes of obtaining the shape information of the chrominance (UV) component by downsampling the shape information of the luminance (Y) component, extending the obtained chrominance (UV) shape information to blocks, and repeatedly padding the texture information of the chrominance (UV) component horizontally and vertically using the chrominance (UV) shape information and the block-extended chrominance (UV) shape information.
Referring to Fig. 4, a scalable shape encoding process consists of processes of wavelet transforming the shape information, sequentially encoding the transformed low-frequency band shape information of each layer from the low-resolution shape information upward, and outputting a bitstream. The process of scalably encoding the shape information, executed in units of a block, includes the steps of receiving the framed binary alpha blocks (BAB) shown in Fig. 9A, determining for each block which encoding mode is used according to whether the exclusive OR information of each pixel in the block can be used, and obtaining the context information of the pixels in the block and performing arithmetic encoding on them according to the determined encoding mode. When the determined encoding mode is the interleaved scan line (ISL) encoding method, the context information of the pixels in a block is obtained and arithmetic encoding is performed on them in the interleaved scan line order, as shown in Fig. 12. When the encoding mode is the raster scan line (RSL) encoding method, the context information of the pixels in a block is obtained and arithmetic encoding is performed on them in the raster scan line order, as shown in Fig. 14.
Referring to Fig. 5, a scalable shape decoding process consists of processes of receiving an encoded bitstream, sequentially scalably decoding the bitstream from the base layer, obtaining the low-frequency band shape information of each layer, and obtaining the wavelet transformed shape information and a recovered shape, for recovering the texture information, from the recovered low-frequency band. The process of scalably decoding the binary alpha block, performed in units of a block, includes the steps of reconstructing the encoded input bitstream into the framed binary alpha block (BAB) shown in Fig. 9A, and obtaining the context information of the pixels in the block and performing arithmetic decoding on them according to the encoding mode. When the encoding mode is the ISL encoding method, the context information of the pixels in a block is obtained in the ISL order and arithmetic decoding is performed on them, as shown in Fig. 13, whereby the pixels are restored. When the encoding mode is the RSL encoding method, the context information of the pixels in a block is obtained and arithmetic decoding is performed on them in the RSL order, as shown in Fig. 14.
With regard to a large input image, when a user wishes to recover not the entire image but a specific portion of it, using only some of the coded data and a small amount of computation, it is necessary to divide the shape information and texture information into tiles and to encode and decode the respective tiles independently.
Fig. 17A is another embodiment of a scalable encoder for producing a still image using the wavelet transform of the present invention, showing the construction of an object-based still image encoder using a tile operation to divide the still image into tiles and encode the divided tiles using the scalable encoder of the still image shown in Fig. 1A. Referring to Fig. 17A, a number of tile dividers 1701 and 1711 divide one or more input objects 1700 and 1710 into tiles. Control signal encoders 1702 and 1712 encode the control signals generated by the tile dividers 1701 and 1711. The dimensions of the tiles produced by the tile dividers 1701 and 1711 are divisible by 2 with remainder 0. Also, the size of the tiles produced by the tile dividers 1701 and 1711 is divisible by 2^(N+1) in the horizontal and vertical directions when the number of layers of the wavelet transform for resolution scalability is N. The scalable encoders 1703 and 1713 shown in Fig. 1A encode the tiles (tile 0, tile 1, ..., and tile M-1, or tile 0, tile 1, ..., and tile N-1) produced by the corresponding tile dividers 1701 and 1711. The encoded bitstreams of each input object are sequentially concatenated by lower multiplexers 1704 and 1714. An upper multiplexer 1720 obtains an encoded bitstream 1730 with respect to all input objects and transmits the encoded bitstream.
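The tile-size constraint above can be expressed as a small check. The following is an illustrative sketch, assuming (as stated) that both tile dimensions must be divisible by 2^(N+1) for N wavelet layers; the function name is not part of the described encoder.

```python
def valid_tile_size(width: int, height: int, num_layers: int) -> bool:
    """Return True when a width x height tile satisfies the 2^(N+1)
    divisibility rule stated above for N wavelet layers.
    (Illustrative helper, not part of the described encoder.)"""
    divisor = 2 ** (num_layers + 1)  # 2^(N+1) in horizontal and vertical direction
    return width % divisor == 0 and height % divisor == 0
```

For example, with N = 3 layers a 128x128 tile qualifies (128 is a multiple of 16), while a 100x128 tile does not.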
Fig. 17B shows the inverse of the processes shown in Fig. 17A: the structure of an object-based still picture decoder, including the tile operation of decoding a specific portion of an image with a small amount of computation using a portion of the encoded bitstream. Referring to Fig. 17B, an upper demultiplexer 1740 divides the received bitstream 1730 into coded objects. A number of lower demultiplexers 1750 and 1760 divide the bitstream of each object into a control signal component and a number of tile components. Control signal decoders 1751 and 1761 receive the control signal components from the lower demultiplexers 1750 and 1760, respectively, and decode them. A number of scalable decoders 1752 and 1762 shown in Fig. 1B receive the tile components from the lower demultiplexers 1750 and 1760, respectively, and decode them. A number of tile assemblers 1753 and 1763 reconstruct a corresponding object using the tile components (tile 0, tile 1, ..., and tile M-1, or tile 0, tile 1, ..., and tile N-1) and a control component from the corresponding scalable decoders 1752 and 1762. An object builder 1780 assembles the objects composed by the tile assemblers 1753 and 1763 to obtain a final output image 1790.
The operating principle of the present invention will be described below.
Fig. 1A shows the construction of a still image encoder using wavelet transformation.
As shown in Fig. 1A, when Shape_enable 102 is on, shape information of the luminance (Y) component of an input image 101 is scalably encoded and texture information of the luminance (Y) component is encoded in a wavelet domain using the shape information wavelet transformed by the luminance (Y) shape scalable encoder 103. When Shape_enable 102 is off, only texture information is encoded, without the shape information.
The texture information of the chrominance (UV) component is always encoded. Whether the shape information of the chrominance (UV) component is encoded is determined by the condition Shape_enable & Chroma_shape_enable 105. When the condition is met, the shape information and texture information of the chrominance (UV) component are padded and the padded shape information of the chrominance (UV) component is scalably encoded. The condition Shape_enable & Chroma_shape_enable 105 is met when the shape information of the chrominance (UV) component must be encoded, namely when the input image has an arbitrary shape and the wavelet filter included in the scalable encoders 103 and 107 is an odd symmetry filter. When the wavelet filter included in the scalable encoders 103 and 107 is an even symmetry filter, it is not necessary to encode additional shape information, since the shape information of the chrominance (UV) component can be obtained from the shape information of the luminance (Y) component of each layer.
An encoded bitstream 110 is restored as shown in Fig. 1B. When the encoded bitstream 110 is input through a demultiplexer 111, the shape information of the luminance (Y) component is scalably decoded according to the condition Shape_enable 112, and the texture information of the luminance (Y) component in the wavelet domain is decoded using the shape information of the decoded luminance (Y) component. If the Shape_enable 112 condition is not met, only the texture information is decoded, without the shape information.
The texture information of the chrominance (UV) component is always decoded. Whether the chrominance component shape information is decoded is determined by the condition Shape_enable & Chroma_shape_enable 115. When the condition is met, the shape information of the chrominance (UV) component is scalably decoded. The decoded shapes of the respective layers are used for decoding the texture information. The Shape_enable & Chroma_shape_enable 115 condition is the same as the Shape_enable & Chroma_shape_enable 105 condition.
Fig. 2 shows the processes of padding the shape information and texture information of the chrominance (UV) component of Fig. 1A. When an original image 201 is input, the shape information of the luminance (Y) component is downsampled 4:1 (step 202) and the downsampled shape information is extended to blocks (step 204). The length of one side of a block is B = 2^(scal_level - 1), where scal_level indicates the number of scalable layers. Fig. 3A shows an example of the extension to blocks. The downsampled shape information 301 in Fig. 3A is extended by an area 302 divided into blocks. In the extended shape information of the chrominance (UV) component, there is no texture information in the area between the downsampled shape information 301 and the added area 302. To compensate for this, as shown in Fig. 2, area-based horizontal and vertical padding (steps 206 and 207) is performed using the shape information of the chrominance (UV) component extended in step 204, the shape information of the chrominance (UV) component 203 downsampled in step 202, and the texture information of the input chrominance (UV) component 205. Thus, padded UV shape information and texture information 208 are obtained.
The padding compensates for positions where there is no texture information by using the texture information of adjacent positions. Referring to Fig. 3B, given an original chrominance (UV) component image 303 and an expanded image 304, the texture information of area B 306 is filled horizontally and vertically using the texture information of the boundary of area A 305, which is shared by the two areas. The direction of reference of the texture information is shown by the arrows in Fig. 3B. The process of repetitive padding in the horizontal direction is as follows:
Here, ref_shape[][], s[][], d[][], and hor_pad[][] indicate the extended shape information, the shape information downsampled from the shape information of the chrominance (UV) component, the texture information, and the image obtained after performing padding in the horizontal direction, respectively. x' indicates the effective pixel position (s[y][x'] == 1) closest to and to the left of the current position x, and x'' indicates the effective pixel position closest to and to the right of the current position x. M and N indicate the width and height of the image.
When a valid pixel value exists only on the left (or right) side of the current position, that value is used as the pixel value of the current position. When valid pixel values exist on both the left and right sides of the current pixel, the average of the two values is used as the pixel value of the current position. The process of repetitive padding in the vertical direction is as follows.
Here, s'[][] and hv_pad[][] represent the shape information obtained by expanding, in the horizontal direction, the shape information obtained by downsampling the shape information of the luminance (Y) component, and the image obtained after performing padding in the vertical direction, respectively. y' indicates the effective pixel position (s'[y'][x] == 1) nearest to and above the current position y, and y'' indicates the effective pixel position nearest to and below the current position y.
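The horizontal pass described above can be sketched as follows. This is an illustrative sketch, assuming binary shape masks stored as row-major lists; shape_ext, shape_down, and texture are illustrative names for the extended shape, the downsampled shape, and the texture array, and the vertical pass is the same loop with rows and columns exchanged.

```python
def pad_horizontal(shape_ext, shape_down, texture):
    """Repetitive horizontal padding: for every pixel inside the extended
    shape but outside the downsampled shape, copy the nearest valid
    texture value to the left/right on the same row, or average the two
    when both exist. Sketch of the rule described in the text."""
    out = [list(row) for row in texture]
    for y, row in enumerate(shape_ext):
        valid = [x for x, s in enumerate(shape_down[y]) if s == 1]
        for x, s in enumerate(row):
            if s == 1 and shape_down[y][x] == 0:
                left = [v for v in valid if v < x]
                right = [v for v in valid if v > x]
                if left and right:   # both sides valid: average of the two
                    out[y][x] = (texture[y][left[-1]] + texture[y][right[0]]) // 2
                elif left:           # only a left neighbour is available
                    out[y][x] = texture[y][left[-1]]
                elif right:          # only a right neighbour is available
                    out[y][x] = texture[y][right[0]]
    return out
```

For example, a row whose valid pixels are at both ends fills its middle pixel with the average of the two end values.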
Fig. 4 shows the structure of the scalable encoder of the shape information using wavelet transform. An input image 401 forms a shape pyramid of layers 402, 404, 406 and 408 via shape-adaptive discrete wavelet transforms (SA-DWT) 403, 405 and 407. The wavelet-transformed shape information items 404, 406 and 408 of the respective layers are input to the wavelet encoders 104 and 108 of Fig. 1A and are used to encode the texture information of each layer. The low-frequency band shape information 409 of the bottom layer 408 is encoded by a general shape encoder 410. A context-based arithmetic encoder (CAE) can be used as the shape encoder. The low-frequency band shape information items 412 and 415 of the shape layers 406 and 404 (excluding the top layer and the bottom layer) and the topmost shape information item 402 are encoded by the scalable encoders 413, 416 and 418 provided in the present invention, using the low-frequency band shape information items 409, 412 and 415 of the layers below them. The bitstreams 411, 414, 417 and 419 encoded in the respective layers are formed into a bitstream 421, from the bottom layer to the top layer, by a multiplexer 420, and the bitstream 421 is sent to a channel.
Fig. 5 shows the structure of the scalable decoder of the shape information using wavelet transform. An encoded bitstream 501 is divided into the bitstream of the bottom layer and the bitstreams of the upper layers. The bottom layer bitstream 503 is used to obtain a low-frequency band shape 505 through a general shape decoder 504. The bitstreams 507, 511 and 515 of the upper layers are used to obtain the low-frequency band shapes 509 and 513 of the respective layers and the shape information 517 of the top layer via the scalable decoders 508, 512 and 516. The scalable decoders 508, 512 and 516 of the respective layers receive the coded bitstreams corresponding to the respective layers and the low-frequency band shape information items 505, 509 and 513 of the layers below them. In order to restore the wavelet-transformed shape information in each layer, the low-frequency band shape information items 509 and 513 of the upper layers and the shape information 517 of the topmost layer are shape-adaptive wavelet transformed (518, 519 and 520), and the restored shapes 506, 510 and 514 of the respective layers are obtained using the LL, LH, HL and HH band information items from the SA-DWTs 518, 519 and 520. The restored shapes 506, 510 and 514 of the respective layers are input to the texture wavelet decoders 114 and 117 of Fig. 1B and are used to restore the texture components.
Fig. 6 describes the processes of forming the pyramid of shape information using the one-dimensional wavelet transform, and the inverse processes. The method of one-dimensional decomposition of the shape information varies according to the type of wavelet transform filter. When the wavelet transform filter is an odd symmetry filter, the even-numbered pixel values of an input signal are sampled into the low-frequency band and the odd-numbered pixel values into the high-frequency band. When a shape segment of one-pixel length is received and that pixel is odd-numbered, the low-frequency band pixel is swapped with the high-frequency band pixel.
When the wavelet transform filter is an even symmetry filter, the even-numbered pixel values of the received signal are sampled into the low-frequency band and the odd-numbered pixel values into the high-frequency band, as with the odd symmetry filter. When the starting point of a segment of successive pixel values of 1 is odd-numbered, the high-frequency band signal at that point is swapped with the low-frequency band signal. This creates the effect of performing an OR operation between the low-frequency information items of the respective layers.
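For the even symmetry case, the split-and-swap rule can be sketched as below; the sketch also demonstrates the OR-equivalence noted above. This is a simplified 1-D illustration under the stated rule, not the normative SA-DWT.

```python
def shape_split_even_filter(row):
    """1-D shape split for an even symmetry filter: even-numbered pixels
    go to the low band and odd-numbered pixels to the high band; when a
    run of 1s starts at an odd position (low sample 0, high sample 1),
    the two samples are swapped. The resulting low band then equals an
    OR over each pixel pair. Assumes an even-length binary row."""
    low = list(row[0::2])
    high = list(row[1::2])
    for i in range(len(high)):
        if low[i] == 0 and high[i] == 1:  # run of 1s starting at odd position 2i+1
            low[i], high[i] = high[i], low[i]
    return low, high
```

For example, the row [0, 1, 1, 0] yields the low band [1, 1], exactly the pairwise OR of its two pixel pairs.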
An input image 601 is divided into a low-frequency (L) band and a high-frequency (H) band by performing a one-dimensional transformation 602 in the vertical direction. When a transformation 603 is performed in the horizontal direction in each band, the low-frequency (L) band is divided into a low-low (LL) band and a high-low (HL) band, and the high-frequency (H) band is divided into a low-high (LH) band and a high-high (HH) band. Consequently, the input image is divided into four bands. When this process is repeatedly applied to the low-low (LL) band, a pyramid structure 606 of the image is obtained. It is also possible to obtain the four-band image by performing a transformation 604 in the horizontal direction first and then a transformation 605 in the vertical direction.
Fig. 7 shows three-layer scalable shape encoding processes with respect to a 4:2:0 format image. In Fig. 7, reference numerals 701 and 702 show the wavelet pyramid image of a luminance (Y) component divided into three layers and the wavelet pyramid image of a chrominance (UV) component divided into two layers, respectively. The number of layers of the chrominance (UV) component image is one less than that of the luminance (Y) component image, since the chrominance (UV) component image is downsampled 4:1 with respect to the luminance (Y) component image.
It is possible to obtain the LL band images 703, 704, 705 and 706 of the respective layers from the wavelet pyramid image 701 of the luminance (Y) component, and the LL band images 710, 711 and 712 of the respective layers from the wavelet pyramid image 702 of the chrominance (UV) component. A scalable encoder and decoder sequentially encode and decode the LL band shape information from the lowest layer upward. The lowest shape information items 703 and 710 of the respective pyramids are encoded by a general shape encoder. The shape information items 704, 705, 706, 711 and 712 of the upper layers are scalably encoded using the shape information items 703, 704, 705, 710 and 711 of the layers below them; the inter-layer relations are indicated by reference numbers 707, 708, 709, 713 and 714.
When the wavelet transform filter is an even symmetry filter, the wavelet transforms 707, 708, 709, 713 and 714 between the layers can be expressed by an OR operation. Therefore, when the shape information 712 of the chrominance (UV) component is obtained by downsampling the shape information 706 of the top layer of the luminance (Y) component via the OR operation, the shape information items 703, 704 and 705 of the luminance (Y) component of the respective layers are equal to the shape information items 710, 711 and 712 of the chrominance (UV) component. The chrominance (UV) components, downsampled in the ratio 4:1 with respect to the luminance (Y) components of the respective scalable layers, are correctly paired one-to-one with the luminance (Y) components. Consequently, no visual problem is caused. In this case, scalable shape coding of the chrominance (UV) component is not necessary. Therefore, the conditions Shape_enable & Chroma_shape_enable 105 and 115 in Figs. 1A and 1B become 0, and the shape information of the chrominance (UV) component is not encoded.
When the wavelet transform filter is an odd symmetry filter, the wavelet transforms 707, 708, 709, 713 and 714 between the layers are not OR operations. When the shape information 712 of the chrominance (UV) component of the uppermost layer is obtained by downsampling the shape information 706 of the luminance (Y) component via the OR operation, some values of the shape information items 710 and 711 of the chrominance (UV) component, downsampled in the ratio 4:1, do not appear in the shape information items 704 and 705 of the luminance (Y) component, so the color component fades at some positions of the decoded shape. In order to reduce this effect, when the image of the uppermost layer of the chrominance (UV) component is obtained, the shape information obtained by downsampling the shape information of the luminance (Y) component via the OR operation is extended to blocks according to the number of layers, the texture component is obtained by performing horizontal and vertical padding, and then the shape information of the chrominance (UV) component is encoded.
Figs. 8A and 8B are flow charts describing the scalable encoding/decoding methods in units of a block. Referring to Fig. 8A, binary alpha block (BAB) data 801 is framed (step 802). An encoding mode is determined (step 803). When the encoding mode is the ISL encoding mode, ISL encoding is performed (step 804). Otherwise, RSL encoding is performed (step 805). In the ISL mode, a pixel is encoded using the relationship between the pixel to be encoded and its left and right or upper and lower pixels. In the RSL mode, the relationship between a pixel to be encoded and the pixels of the lower layers is used. After encoding the BAB data, when the pixel encoding has been performed to the end of the image, the encoding ends and the encoded bitstream 809 is output. If the pixel encoding has not reached the end of the image, the steps from step 802 are performed again with respect to the following BAB data (step 808).
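The block loop of Fig. 8A can be sketched as a driver that defers to the two pixel coders. The callables below are placeholders standing in for the mode decision and the ISL/RSL coders described in the text; this is an illustrative sketch, not the normative encoder.

```python
def encode_shape(babs, lower_babs, decide_mode, isl_encode, rsl_encode):
    """Per-BAB encoding loop (sketch of Fig. 8A): decide a mode for each
    block (step 803), then run ISL (step 804) or RSL (step 805) coding
    and append the result to the bitstream."""
    bitstream = []
    for bab, lower in zip(babs, lower_babs):
        mode = decide_mode(bab, lower)            # step 803
        bitstream.append(('mode', mode))          # mode flag is itself coded
        coder = isl_encode if mode == 'ISL' else rsl_encode
        bitstream.extend(coder(bab, lower))       # step 804 / step 805
    return bitstream
```

The decoder of Fig. 8B mirrors this loop, reading the mode flag first and dispatching to the matching pixel decoder.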
Fig. 8B shows the processes inverse to those of Fig. 8A. A frame is set around the BAB data to be restored by receiving the bitstream 810 encoded as in Fig. 8A and the lower-layer or previously restored BAB data 811 (step 812). The encoding mode is decoded (step 813). When the encoding mode is the ISL mode, ISL decoding is performed (step 814). Otherwise, RSL decoding is performed (step 815). After decoding the BAB data, when decoding has been performed to the end of the picture, decoding ends, whereby the recovered shape information 817 is obtained. If the decoding has not reached the end of the image, the processes from step 812 are repeated after receiving the next input bitstream 810 (step 816).
Fig. 9 describes the processes of setting the frame of the BAB of Fig. 8 in detail. In order to encode a pixel of a BAB, context information referring to the pixels around the pixel to be encoded must be obtained. However, a pixel on the boundary of the BAB may lack the neighboring pixels needed for obtaining the context information.
Therefore, a frame area for the BAB is set before encoding the pixels. Fig. 9A shows the framing of an 8x8 block of a lower layer. Since all shape information items of the lower layers are available, the values of the pixels around the boundary of the BAB are taken as a 1x1 upper-left boundary A 902, an 8x1 upper boundary B 903, a 1x1 upper-right boundary C 904, a 1x8 left boundary D 905, a 1x8 right boundary E 906, a 1x1 lower-left boundary F 907, an 8x1 lower boundary G 908, and a 1x1 lower-right boundary H 909 of a BAB 901. When the pixels around the boundary of the BAB lie outside the input image, their values are set to 0.
Fig. 9B describes a method for setting the frame area of a 16x16 BAB of the current layer for the ISL encoding mode. The values of the pixels restored in the previous shape blocks are used as a 1x2 upper-left boundary A 911, a 16x2 upper boundary B 912, a 1x2 upper-right boundary C 913, and a 1x16 left boundary D 914 of a BAB 910. A 1x16 right boundary E 915 and a 16x1 lower boundary G 917 of the BAB 910 are obtained by upsampling the 1x8 right boundary E 906 and the 8x1 lower boundary G 908 of the lower-layer BAB shown in Fig. 9A. Namely, the 1x16 right boundary E 915 and the 16x1 lower boundary G 917 of the BAB 910 are obtained by simply repeating each pixel twice when the boundaries are upsampled in the ratio 1:2. The values of the 1x1 lower-left boundary F 907 and the 1x1 lower-right boundary H 909 of the lower layer are used as the values of the 1x1 lower-left boundary F 916 and the 1x1 lower-right boundary H 918 of the current layer.
Fig. 9C describes a method for setting the frame area of the 16x16 BAB of the current layer for the RSL encoding mode. The values of the pixels restored in the previous BABs are used as a 1x1 upper-left boundary A 920, a 16x1 upper boundary B 921, a 1x1 upper-right boundary C 922, and a 1x16 left boundary D 923 of a BAB 919, as in the ISL mode of Fig. 9B. The dimensions of the frame areas A 920, B 921 and C 922 in Fig. 9C differ from those in Fig. 9B because the position of the context information of the ISL mode differs from that of the RSL mode. A 1x16 right boundary E 924 and a 16x1 lower boundary G 926 of the BAB 919 are obtained by upsampling the 1x8 right boundary E 906 and the 8x1 lower boundary G 908 of the lower layer shown in Fig. 9A. The values of the 1x1 lower-left boundary F 907 and the 1x1 lower-right boundary H 909 are used as the values of a 1x1 lower-left boundary F 925 and a 1x1 lower-right boundary H 927.
Fig. 10 shows the conditions for determining, in step 803 of Fig. 8, whether the encoding mode of a BAB is the ISL method or the RSL method. These conditions are checked using the framed BABs of the two layers as inputs. The 1x1 pixel P_L 1001 of the BAB f1(i, j) of the lower layer corresponds (1002) to the 2x2 block of pixels P0, P1, P2, P3 1003 of the BAB f2(i, j) of the current layer. The boundaries 1007 and 1008 of the BAB f2(i, j) are obtained by repeated upsampling of the boundaries 1004 and 1005 of the BAB f1(i, j). The boundary 1009 of the current layer is obtained from the boundary 1006 of the lower layer. Equation 1 is checked to determine the encoding mode.
[Equation 1]
condition1 = (f2(2i, 2j) == f1(i, j))
condition2 = !(!(f2(2i, 2j) ⊕ f2(2i+2, 2j)) && (f2(2i+1, 2j) != f2(2i, 2j)))
condition3 = !(!(f2(2i, 2j) ⊕ f2(2i, 2j+2)) && (f2(2i, 2j+1) != f2(2i, 2j)))
condition4 = !(!(f2(2i+1, 2j) ⊕ f2(2i+1, 2j+2)) && (f2(2i+1, 2j+1) != f2(2i+1, 2j)))
Here, ⊕ stands for the exclusive OR operation, which has the value 1 when the two operands differ (1 and 0, or 0 and 1). The && operator stands for the AND operation and the ! operator stands for the NOT operation. When all four conditions of Equation 1 are met, the BAB is encoded in the ISL encoding mode. When any of the four conditions of Equation 1 is not met, the BAB is encoded in the RSL encoding mode.
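Written out as code, the decision scans every 2x2 block of the current layer against its lower-layer pixel. The following is a sketch of the four conditions of Equation 1, assuming row-major lists that already include the right and bottom frame samples so that the indices 2i+2 and 2j+2 are valid.

```python
def decide_bab_mode(f2, f1):
    """Return 'ISL' when all four conditions of Equation 1 hold for every
    pixel of the lower-layer block f1 (H x W) against the current-layer
    block f2 ((2H+1) x (2W+1) including the frame), else 'RSL'."""
    for i in range(len(f1)):
        for j in range(len(f1[0])):
            y, x = 2 * i, 2 * j
            p0, p1 = f2[y][x], f2[y][x + 1]
            p2, p3 = f2[y + 1][x], f2[y + 1][x + 1]
            if p0 != f1[i][j]:                        # condition 1: P0 == P_L
                return 'RSL'
            if f2[y][x + 2] == p0 and p1 != p0:       # condition 2: left/right of P1
                return 'RSL'
            if f2[y + 2][x] == p0 and p2 != p0:       # condition 3: above/below P2
                return 'RSL'
            if f2[y + 2][x + 1] == p1 and p3 != p1:   # condition 4: above/below P3
                return 'RSL'
    return 'ISL'
```

An all-ones block trivially satisfies every condition and is coded in ISL mode; flipping P1 while its left and right neighbours agree violates condition 2 and forces RSL mode.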
Condition 1 is satisfied when the pixel P_L is the same as the pixel P0. Therefore, in the ISL mode, P0 is not encoded. Generally, in shape information there is a high degree of coherence between the pixel currently being encoded and its left and right or upper and lower pixels. Therefore, when the left and right pixels of the pixel to be encoded have the same value, or the pixels above and below the pixel to be encoded have the same value, there is a high probability that this value equals the value of the pixel to be encoded. Conditions 2, 3 and 4 check whether the two adjacent pixels have the same value and whether the value of the pixel to be encoded equals the value of those two adjacent pixels. Condition 2 checks whether the pixels to the left and right of the pixel P1 to be encoded have the same value and whether that value equals the value of the pixel to be encoded. Conditions 3 and 4 check whether the pixels above and below the pixels P2 and P3 to be encoded have the same value and whether that value equals the value of the pixel to be encoded. Meeting Conditions 2, 3 and 4 means that P1, P2 and P3 need to be encoded only when the left and right or upper and lower pixels have different values. It is therefore possible to improve the encoding efficiency in the ISL encoding mode by reducing the number of pixels P1, P2 and P3 to be encoded, using the relationship between interleaved scan lines. The RSL encoding mode is applied when the position of a pixel is changed into the low-frequency band during the wavelet transformation and one or more of Conditions 2, 3 and 4 are not met. In that case, all pixels P0, P1, P2 and P3 of the current layer are encoded using the values of the pixels of the lower layers. The encoding mode information is itself encoded using arithmetic encoding. The probability distribution of the arithmetic encoder for the BAB encoding mode is as follows.
static unsigned int scalable_bab_type_prob[2] = {59808, 44651};
Fig. 11 is a flow chart describing the encoding method in the ISL mode. When BAB data 1101 framed by the method of Fig. 9B is received, the respective pixels in the BAB are scanned in the encoding order of the ISL encoding mode (step 1102). The encoding order of the ISL encoding mode is shown in Fig. 15A. Given the relationship between the pixels of the two layers, as at reference number 1002 of Fig. 10, P0 is not encoded since P0 is predicted from P_L, and the value of P1 is encoded first. Then P2 and P3 are encoded sequentially. Namely, when the pixel to be encoded is P0 (step 1103), the pixel is not encoded. If the pixel to be encoded is not P0, it is checked whether the pixel to be encoded is P1 (step 1104).
When the pixel to be encoded is P1, it is checked whether the pixels to the left and right of the pixel to be encoded have the same value (step 1105). When they have the same value, the value of P1 is not encoded. When they have different values, the context information for encoding the pixel and the probability value for the arithmetic encoding are calculated, and the value of P1 is arithmetically encoded (steps 1106 and 1107).
The context information for encoding the value of the pixel P1 is shown in Fig. 16A. The context information is obtained by Equation 2 using the 7 pixels around the pixel to be encoded.
[Equation 2] C = Σ c_k · 2^k (k = 0, ..., 6), where c_k are the values of the 7 pixels around the pixel to be encoded.
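In code, Equation 2 is a simple bit-packing of the 7 neighbour values. The following sketch assumes the neighbour ordering c_0..c_6 of Figs. 16A/16B, which is not reproduced here.

```python
def context_index(neighbors):
    """Equation 2: C = sum(c_k * 2**k) over the 7 neighbouring pixel
    values c_0..c_6, giving an index in 0..127 into the 128-entry
    probability tables used by the arithmetic coder."""
    assert len(neighbors) == 7
    return sum(c << k for k, c in enumerate(neighbors))
```

The index then selects, for example, scalable_xor_prob_1[C] as the probability of the pixel value for the arithmetic coder.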
It is possible to obtain the probability distribution of the value of the pixel to be encoded according to the value of the context information, which describes the arrangement of the pixels around the pixel to be encoded. When there are many 1s around the pixel to be encoded, there is a very high probability that the pixel to be encoded is 1. It is therefore possible to reduce the number of bits spent when a 1 is actually encoded. For the other contexts, the number of bits for the pixel to be encoded is reduced in the same way. The probability distribution according to the context information of the value of P1 used in the ISL encoding mode is as follows.

<Probability distribution 1>
static unsigned int scalable_xor_prob_1[128] = {65476, 64428, 62211, 63560, 52253, 58271, 38098, 31981, 50087, 41042, 54620, 31532, 8382, 10754, 6917, 63834, 50444, 50140, 63043, 58093, 45146, 36768, 13351, 17594, 28777, 39830, 38719, 9768, 21447, 12340, 9786, 60461, 41489, 27433, 53893, 47246, 11415, 13754, 24965, 51620, 28011, 11973, 29709, 13878, 22794, 24385, 1558, 57065, 41918, 25259, 55117, 48064, 12960, 19929, 5937, 25730, 22366, 5204, 32865, 3415, 14814, 6634, 1155, 64444, 62907, 56337, 63144, 38112, 56527, 40247, 37088, 60326, 45675, 51248, 15151, 18868, 43723, 14757, 11721, 62436, 50971, 51738, 59767, 49927, 50675, 38182, 24724, 48447, 47316, 56628, 36336, 12264, 25893, 24243, 5358, 58717, 56646, 48302, 60515, 36497, 26959, 43579, 40280, 54092, 20741, 10891, 7504, 8109, 30840, 6772, 4090, 59810, 61410, 53216, 64127, 32344, 12462, 23132, 19270, 32232, 24774, 9615, 17750, 1714, 6539, 3237, 152};

When the pixel to be encoded is P2 or P3, it is checked whether the pixels above and below the pixel to be encoded have the same value (step 1108). When they have the same value, the value of P2 or P3 is not encoded.
When the pixels above and below the pixel to be encoded have different values, the context information for encoding the pixel and the probability value for performing the arithmetic coding are calculated, and the arithmetic coding of the value of P2 or P3 is performed (steps 1109 and 1110). The context information for encoding the pixel values P2 and P3 is obtained by Equation 2 using the 7 pixels around the pixel to be encoded, as shown in Fig. 16B. The probability distribution according to the context information of the values P2 and P3 used in the ISL encoding mode is as follows.

<Probability distribution 2>
static unsigned int scalable_xor_prob_23[128] = {65510, 63321, 63851, 62223, 64959, 62202, 63637, 48019, 57072, 33553, 37041, 9527, 53190, 50479, 54232, 12855, 62779, 63980, 31847, 57591, 64385, 40657, 8402, 33878, 54743, 17873, 8707, 34470, 54322, 16702, 2192, 58325, 48447, 31317, 45687, 44236, 16685, 24144, 34327, 18724, 10591, 24965, 9247, 7281, 3144, 5921, 59349, 33539, 11447, 5543, 58082, 48995, 35630, 10653, 7123, 15893, 23830, 800, 3491, 15792, 8930, 905, 65209, 63939, 52634, 62194, 64937, 53948, 60081, 46851, 56157, 50930, 35498, 24655, 56331, 59318, 32209, 6872, 59172, 64273, 46724, 41200, 53619, 59022, 37941, 20529, 55026, 52858, 26402, 45073, 57740, 55485, 20533, 6288, 64286, 55438, 16454, 55656, 61175, 45874, 28536, 53762, 58056, 21895, 5482, 39352, 32635, 21633, 2137, 4016, 58490, 14100, 18724, 10461, 53459, 15490, 57992, 15128, 12034, 4340, 1859, 5794, 6785, 2412, 35};
When a pixel value is encoded, it is checked whether the pixel is the last pixel of the BAB data (step 1111). When the pixel is the last pixel, an encoded bitstream 1112 is obtained. If the pixel is not the last pixel, processes after step 1102 are repeated with respect to a new pixel.
Fig. 12 is a flow chart describing the decoding method of the ISL mode, whose processes are inverse to those of Fig. 11. When an encoded bitstream 1201 and framed BAB data 1202 are input, the pixels are scanned in the encoding order of the ISL encoding mode (step 1203). The decoding of the ISL encoding mode is performed in the order shown in Fig. 15A. It is checked whether the pixel to be restored is P0 (step 1204). When the pixel to be restored is P0, the pixel value is restored to the value P_L of the lower layer (step 1205). If the pixel to be restored is not P0, it is checked whether the pixel to be restored is P1 (step 1206).
When the pixel to be restored is P1, it is checked whether the left and right pixel values (C3 and C2 in Fig. 16A) are equal to each other (step 1207). When C3 and C2 are equal, P1 is restored to the shared left and right pixel value (step 1208). When C3 differs from C2, the context information for decoding the pixel and the probability value for performing the arithmetic decoding are calculated, and P1 is restored by performing the arithmetic decoding (steps 1209 and 1210). The context information for decoding the pixel value P1 is obtained by Equation 2 using the 7 pixels, shown in Fig. 16A, around the pixel to be decoded. The probability distribution according to the context information of the value P1 used in the ISL encoding mode is the same as <Probability distribution 1>.
When the pixel value to be restored is the value P2 or P3, it is checked whether the pixel values above and below it (C1 and C5 of Fig. 16B) are equal to each other (step 1211). When C1 is equal to C5, the value of P2 or P3 is restored to the upper or lower pixel value C1 or C5 (step 1212). When C1 is different from C5, the context information for decoding the pixel and the probability value for performing the arithmetic decoding on the pixel are calculated and the value of P2 or P3 is restored by performing the arithmetic decoding (steps 1213 and 1214). The context information for decoding the values P2 and P3 is obtained by Equation 2 using the 7 pixels depicted in Fig. 16B around the pixel to be decoded. The probability distribution according to the context information of the values P2 and P3 used in the ISL encoding mode is the same as <Probability distribution 2>. When a pixel is restored by the above decoding process, it is checked whether the restored pixel is the last pixel of the BAB data (step 1215). When the pixel is the last pixel of the BAB data, recovered BAB data 1216 is obtained. If the pixel is not the last pixel of the BAB data, the processes after step 1203 are repeated with respect to a new pixel.
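The ISL restoration rules above can be sketched as follows. This is an illustrative reconstruction, not the patent's normative code: `decode_pixel` is a hypothetical stand-in for the context-based arithmetic decoder, and border pixels of the framed BAB are approximated by copying the available neighbour.

```python
def isl_restore(h, w, low, decode_pixel):
    """Restore an h-by-w current-layer BAB from the lower-layer block `low`.

    `decode_pixel(kind, i, j)` stands in for the arithmetic decoder and is
    called only for pixels that cannot be predicted from their neighbours.
    """
    cur = [[0] * w for _ in range(h)]
    # P0 pixels: copied directly from the lower-layer pixel PL (step 1205).
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            cur[i][j] = low[i // 2][j // 2]
    # P1 pixels: copied from the common left/right value when C3 == C2,
    # otherwise recovered by the arithmetic decoder (steps 1207-1210).
    for i in range(0, h, 2):
        for j in range(1, w, 2):
            left = cur[i][j - 1]
            right = cur[i][j + 1] if j + 1 < w else left
            cur[i][j] = left if left == right else decode_pixel("P1", i, j)
    # P2/P3 pixels: copied from the common upper/lower value when C1 == C5,
    # otherwise recovered by the arithmetic decoder (steps 1211-1214).
    for i in range(1, h, 2):
        for j in range(w):
            above = cur[i - 1][j]
            below = cur[i + 1][j] if i + 1 < h else above
            cur[i][j] = above if above == below else decode_pixel("P2/P3", i, j)
    return cur
```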
Fig. 13 is a flow chart describing the RSL mode encoding method. When BAB data 1301 framed by the method of Fig. 9C is input, the respective pixels in the BAB are scanned in the encoding order of the RSL encoding mode (step 1302). The encoding order of the RSL encoding mode is shown in Fig. 15B. It is checked whether the pixel value to be encoded is the value P0 (step 1303). When the pixel value is the value P0, it is checked whether the corresponding lower-layer pixel value PL is 1 (step 1304). When PL is 1, the context information for coding the pixel and the probability value for performing the arithmetic coding on the pixel are calculated and the arithmetic coding is performed on the value of P0 (steps 1305 and 1306).
When the odd symmetry filter is used in the wavelet transformation from the upper layer to the lower layer, the position of an odd-numbered point is changed in the low-frequency bandwidth. When the even symmetry filter is used, the first odd-numbered pixel is changed to the low-frequency component. In the above two cases, since a 0 can be changed to a 1 in the low-frequency region of the lower layer, unlike in the encoding of the other pixels, the value of P0 can be 1 only when the corresponding pixel PL of the lower layer is 1. Therefore, the pixel is not encoded when PL is 0.
The context information for the RSL encoding is shown in Fig. 16C. The context information is obtained by Equation 2 using 5 pixels (C4, C5, C6, C7, and C8 of Fig. 16C) of the lower layers and 4 pixels (C0, C1, C2, and C3 of Fig. 16C) around the pixel of the current layer to be encoded. The probability distribution according to the context information of the pixel to be encoded for the RSL encoding method is as follows.
<Probability distribution 3> static unsigned int scalable_full_prob[512] = {65524, 65478, 65524, 32768, 32768, 32768, 65464, 32768, 32768, 32768, 32768, 32768, 32768, 32768, 32768, 32768, 64349, 21570, 65373, 32768, 32768, 32768, 64685, 32768, 32768, 32768, 32768, 32768, 32768, 32768, 32768, 32768, 65246, 64528, 60948, 64479, 26214, 32768, 16843, 32768, 32768, 32768, 32768, 32768, 32768, 32768, 32768, 32768, 63498, 10078, 50130, 4010, 16384, 32768, 2773, 1316, 32768, 32768, 32768, 32768, 32768, 32768, 32768, 32768, 47058, 21126, 35436, 4626, 37137, 24876, 27151, 11722, 54032, 43538, 25645, 6858, 42976, 36599, 44237, 15996, 38096, 25303, 21007, 5307, 8618, 19293, 3021, 2416, 24740, 35226, 4369, 24858, 19920, 12336, 11718, 4390, 45487, 5313, 26464, 5354, 33556, 19876, 33099, 9713, 15749, 7876, 40867, 36223, 27065, 10377, 42337, 9907, 52230, 2688, 20906, 1269, 8507, 8987, 2929, 767, 23609, 18238, 18787, 32074, 24720, 10786, 34351, 1489, 65519, 65524, 65363, 32768, 32768, 32768, 64171, 32768, 65524, 65531, 32768, 32768, 32768, 32768, 32768, 32768, 65140, 50762, 65102, 32768, 32768, 32768, 62415, 32768, 50218, 41801, 32768, 32768, 32768, 32768, 32768, 32768, 64963, 65368, 59158, 64444, 32768, 32768, 15320, 32768, 65432, 65490, 65054, 65216, 32768, 32768, 32768, 32768, 61586, 52398, 43664, 16798, 4369, 32768, 2261, 8287, 46251, 53036, 33737, 26295, 32768, 32768, 32768, 32768, 60268, 31543, 25894, 11546, 32094, 35000, 19152, 15313, 60467, 30803, 30501, 22027, 55068, 27925, 50009, 14617, 62716, 34972, 23572, 13523, 5767, 22408, 2297, 7880, 48362, 21477, 15490, 21907, 46113, 3403, 36430, 2534, 46798, 6086, 28318, 13929, 16384, 25405, 19032, 14342, 31875, 8303, 43054, 27746, 30750, 11592, 45209, 6647, 49977, 8979, 19805, 3636, 7526, 13793, 1726, 874, 43735, 10691, 21314, 15586, 26597, 1637, 46751, 763, 65521, 64662, 65522, 32768, 65448, 32768, 65519, 32768, 65519, 32768, 65425, 32768, 65518, 32768, 65531, 32768, 64061, 24926, 65438, 32768, 65162, 32768, 65439, 32768, 65387, 32768, 65036, 32768, 65414, 32768, 65505, 32768, 65211, 61440, 64686, 63898, 31500, 32768, 51716, 32768, 54459, 32768, 50302, 32768, 36409, 32768, 39275, 32768, 62824, 17179, 55885, 9925, 36231, 32768, 39442, 5152, 44395, 32768, 40960, 32768, 31267, 32768, 40015, 32768, 37767, 21420, 58706, 9997, 47907, 16277, 31559, 4134, 63689, 53786, 29789, 15490, 53468, 24226, 25698, 10158, 24246, 19795, 41227, 10169, 15452, 11259, 5422, 1509, 42807, 52609, 37449, 27173, 20776, 10504, 18256, 3144, 40953, 4656, 62176, 6482, 35639, 13355, 33765, 4474, 44149, 27748, 48824, 31490, 40902, 12039, 22817, 2077, 46515, 3789, 49266, 5081, 15143, 12674, 4434, 337, 43468, 28306, 31069, 29457, 37942, 6798, 8863, 280, 65500, 65364, 65427, 32768, 64860, 32768, 65280, 32768, 65533, 65529, 65379, 32768, 65499, 32768, 65510, 32768, 63851, 34810, 65361, 32768, 64111, 32768, 65290, 32768, 63376, 46390, 64746, 32768, 65377, 56174, 65475, 32768, 65130, 65036, 61752, 64444, 23546, 32768, 37897, 32768, 64164, 65499, 59443, 65255, 36359, 32768, 41795, 32768, 60451, 46151, 49242, 18561, 21845, 32768, 24846, 11969, 55142, 53590, 37926, 25977, 41804, 32768, 37615, 32768, 60289, 26751, 45180, 16830, 39394, 34740, 24237, 7623, 65005, 61212, 31154, 37511, 63413, 31640, 57423, 8360, 61019, 31563, 47345, 23577, 15308, 13653, 17255, 5024, 59892, 49587, 26933, 31950, 54850, 8587, 41904, 1255, 56552, 9777, 52370, 16762, 17118, 35915, 33507, 7744, 54902, 34383, 54875, 40718, 54047, 22218, 48436, 4431, 50112, 7519, 24647, 6361, 13569, 6303, 5215, 1078, 49640, 21245, 39984, 26286, 45900, 4704, 23108, 206};
When the pixel value to be encoded is not the value P0, the context information for encoding the pixel and the probability value for performing the arithmetic coding on the pixel are calculated and arithmetic coding is performed on the value of P1, P2, or P3 (steps 1307 and 1308). Each time a pixel value is encoded, it is checked whether the pixel is the last pixel of the BAB data (step 1309). When the pixel is the last pixel, an encoded bitstream 1310 is obtained. If the pixel is not the last pixel, processes after step 1302 are repeated with respect to a new pixel.
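The 9-pixel RSL context (C0 through C3 of the current layer, C4 through C8 of the lower layer) indexes the 512-entry probability table above. The following sketch illustrates the idea; the exact bit weighting of Equation 2 is not reproduced in this text, so the weighting context = sum of Ck * 2^k is an assumption for illustration, as is interpreting the 16-bit table entries as probabilities scaled by 65536.

```python
def rsl_context_index(c):
    """c = [C0, ..., C8], each 0 or 1; returns an index 0..511 into the table.

    Assumed weighting (illustrative only): context = sum(Ck * 2**k).
    """
    index = 0
    for k, bit in enumerate(c):
        index |= bit << k
    return index

def prob_of_zero(table, c):
    """Probability in [0, 1] fed to the arithmetic coder for the current pixel.

    Assumes the 16-bit table entries are probabilities scaled by 65536.
    """
    return table[rsl_context_index(c)] / 65536.0
```

With this convention the neutral entry 32768 corresponds to probability 0.5, i.e. a context that carries no information about the pixel.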
Fig. 14 is a flow chart describing the RSL mode decoding method, whose processes are inverse to those of Fig. 13. When an encoded bitstream 1401 and framed BAB data 1402 are input, the encoded bitstream 1401 is scanned in the encoding order of the RSL encoding mode (step 1403). The decoding of the RSL encoding mode is performed in the order shown in Fig. 15B. It is checked whether the pixel value to be restored is the value P0 (step 1404). When the pixel value to be restored is the value P0, it is checked whether the value of the lower-layer pixel PL corresponding to the current pixel position is 0 (step 1405). When the value of PL is 0, the value of P0 is restored to 0. When the value of PL is 1, the context information for restoring the pixel and the probability value for performing the arithmetic decoding on the pixel are calculated and the arithmetic decoding is performed on the value of P0 (steps 1407 and 1408).
The context information for the RSL decoding is shown in Fig. 16C. The context information is obtained by Equation 2 using 5 pixels (C4, C5, C6, C7, and C8 of Fig. 16C) of the lower layers and 4 pixels (C0, C1, C2, and C3 of Fig. 16C) around the pixel of the current layer to be decoded. The probability distribution according to the context information of the pixel to be decoded for the RSL method is equal to <Probability distribution 3>. When the pixel value to be decoded is not the value P0, the context information for restoring the pixel and the probability value for performing the arithmetic decoding on the pixel are calculated and the arithmetic decoding is performed on the value of P1, P2, or P3 (steps 1409 and 1410). Each time a pixel is restored by the above decoding processes, it is checked whether the pixel is the last pixel of the BAB data (step 1411). When the pixel is the last pixel of the BAB data, recovered BAB data 1412 is obtained. If the pixel is not the last pixel of the BAB data, processes after step 1403 are repeated with respect to a new pixel.
Fig. 17A shows the structure of an object-based still image encoder that includes a tile operation. An input object 1700 is divided into tiles and a control signal is encoded. The respective tiles (tile 0, tile 1, ..., and tile M-1) are encoded by the still picture encoder 1703 as shown in Fig. 1, and the encoded bitstreams are combined by the lower multiplexer 1704. When more input objects must be encoded (for example, another input object 1710), those objects are encoded by the same method and the encoded bitstreams are obtained. The encoded bitstreams are combined by the upper multiplexer 1720 into a bitstream 1730 to be transmitted.
Fig. 17B is a block diagram showing processes for obtaining a restored image from the encoded bitstream 1730, which processes are inverse to those of Fig. 17A. The input bitstream 1730 is divided into encoded objects by an upper demultiplexer 1740. Each object is divided into a control signal component and a tile component by a lower demultiplexer 1750, reconstructed, and restored. A restored image of each object is obtained by reconstructing the respective tile components (tile 0, tile 1, ..., tile M-1). The respective restored images are assembled by an object assembler 1780 into a final output image 1790, and the final output image is output.
Fig. 17C shows the result of dividing an object of arbitrary shape into tiles. When an input image (C01) is divided into tiles, there are tiles (C02) in which no shape information is present, tiles (C03) in which shape information is partially present, and tiles (C04) that lie entirely inside the object. (C05) indicates the tiles to be encoded. Each tile is coded independently by the encoder, as if it were an input image. A control signal required to encode the tiles is additionally encoded to control the output image.
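The three tile types of Fig. 17C can be sketched as a simple classification over the binary shape mask. This is an illustrative reconstruction under the assumption that a tile is "empty" when it contains no shape pixels (C02), "partial" when the shape is partially present (C03), and "full" when it lies entirely inside the object (C04); the scalable shape encoder of the invention is applied only to the "partial" tiles.

```python
def classify_tiles(shape, tile_h, tile_w):
    """Classify each tile of a binary shape mask as 'empty', 'partial', or 'full'.

    `shape` is a list of rows of 0/1; tiles are scanned in raster order and
    edge tiles are clipped to the image boundary.
    """
    h, w = len(shape), len(shape[0])
    kinds = []
    for ti in range(0, h, tile_h):
        for tj in range(0, w, tile_w):
            pixels = [shape[i][j]
                      for i in range(ti, min(ti + tile_h, h))
                      for j in range(tj, min(tj + tile_w, w))]
            if not any(pixels):
                kinds.append("empty")    # no shape information (C02)
            elif all(pixels):
                kinds.append("full")     # tile entirely inside the object (C04)
            else:
                kinds.append("partial")  # shape partially present (C03)
    return kinds
```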
Figs. 18A through 18F show bitstream syntaxes describing the entire operation of the scalable encoder for a still image using the wavelet transform of the present invention. The bitstream contains the data compressed by the encoder in the form of binary values of "0" and "1".
StillTextureObject() of Fig. 18A contains a bitstream syntax showing the encoder processing processes. (L001) specifies a start code that serves to distinguish the object to be encoded from other objects. When there are several object information items in the encoded information, the object information items are classified by the start code. Therefore, each object is given a unique name to distinguish one object from another. StillTextureHeader() of (L002), which is described in detail in Fig. 18B, indicates various additional information items required to perform the encoding. (L003) through (L007) provide information about the size of the input image when the input shape information is not of arbitrary shape. When the input image does have an arbitrary shape, processes (L008) through (L019) are executed. (L008) through (L019) contain the starting point and size of a framing area surrounding the shape information and the processes for encoding the shape information when tiles are not used. The decoding of the shape information, shape_object_decoding(), is described in detail in Fig. 18E. (L020) and (L021), which are detailed in Fig. 18C, indicate various control signals required for the tile operation. The encoding of the information in a tile is accomplished in StillTextureTile() of (L022), which is described in detail in Fig. 18D.
StillTextureHeader() of Fig. 18B provides various additional information items for performing the encoding. (L101) indicates whether the tile operation is applied. (L102) gives the identifier (ID) of the object to be encoded. marker_bit of (L103) is a value inserted to prevent a start code from being emulated in any other encoded data. (L104) through (L108) indicate various additional information items related to the wavelet transformation and encoding. (L109) indicates whether the input object has shape information and whether the shape information is to be encoded. (L110) through (L128) refer to the input of scale-related additional information and filter coefficients.
StillTextureTileControl() of Fig. 18C indicates various control signals required for the tile operation. (L201) through (L206) indicate the dimensions of the tiles in the vertical and horizontal directions, and the number of tiles to be coded in the input image. In order to directly restore a tile arbitrarily indicated by a user from the bitstream, (L207) through (L215) indicate the number of bits used to encode the respective tiles, in units of a byte. A 32-bit value is expressed by two 16-bit values.
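The splitting of each 32-bit tile-size value into two 16-bit values, as described above for (L207) through (L215), can be illustrated as follows; the function names are ours, not the syntax element names of the bitstream.

```python
def split32(n):
    """Split a 32-bit tile bit-count into its (high16, low16) halves."""
    assert 0 <= n < 1 << 32
    return (n >> 16) & 0xFFFF, n & 0xFFFF

def join32(hi, lo):
    """Reassemble the original 32-bit value from the two 16-bit halves."""
    return (hi << 16) | lo
```

Writing the count as two 16-bit fields keeps each field short enough that it cannot emulate a 32-bit start code, which is the usual reason for such splits in bitstream syntaxes.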
StillTextureTile() of Fig. 18D indicates the actual processes of encoding the shape information and texture information in a tile. (L303), which indicates the start code of each item of tile information, allows a user to distinguish desired tile information from other tile information items and to restore the desired tile information together with the identifier (ID) of (L304). (L307) indicates the three types of tiles shown in Fig. 17C. (L310) through (L312) show the processes for encoding the shape information in a tile, wherein the scalable encoder of the present invention is used only in the case where the shape information is partially contained in a tile. StillTextureDecoding() of (L314), shown in Fig. 18F, indicates the processes of encoding the texture component using the position information of an encoded shape coefficient. StillTextureDecoding() will not be described in detail since it is not directly related to the content of the present invention.
Shape_object_decoding() of Fig. 18E indicates the processes of scalable decoding of the shape information. (L401) through (L417) show the processes of encoding the shape information of the base layer. (L406) and (L407) indicate the number of BABs in the vertical and horizontal directions. When the tile operation is not applied (tiling_disable == 1), object_width (L013) and object_height (L015) of Fig. 18A are applied. When the tile operation is applied, tile_width and tile_height of (L201) and (L203) of Fig. 18C are applied. Also, wavelet_decomposition_levels of (L106) of Fig. 18B, indicating the number of layers, is used. The processes of encoding the shape information of the base layer are shown as follows.
[Equation 3]
where >> stands for a shift operator.
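The body of Equation 3 is not reproduced in this text. A plausible reading, sketched below purely as an assumption, is that the base-layer low-frequency band has the input dimensions shifted right by the number of decomposition levels, and that the BAB grid is obtained by ceiling division of those dimensions by the BAB size; the 16-pixel default is illustrative (cf. Equation 5).

```python
def base_layer_dims(width, height, wavelet_decomposition_levels):
    """Size of the low-frequency band after the given number of decompositions
    (assumed reading of Equation 3, using the >> shift operator noted above)."""
    return (width >> wavelet_decomposition_levels,
            height >> wavelet_decomposition_levels)

def bab_count(dim, bab_size=16):
    """Number of BABs needed to cover `dim` pixels (ceiling division);
    the bab_size default of 16 is an illustrative assumption."""
    return (dim + bab_size - 1) // bab_size
```

For the 352 x 240 SIF images of Tables 1 and 2 with three decomposition levels, this reading gives a 44 x 30 base-layer band.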
(L417) through (L439) indicate the processes of scalable encoding of the shape information in the overlying layers. In this case, the encoding is accomplished by the ISL encoding or the RSL encoding according to the encoding mode. (L421) and (L422) and (L434) and (L435) indicate the numbers of BABs to be encoded in the upper layers in the vertical and horizontal directions. object_width (L013) and object_height (L015) of Fig. 18A are used for the numbers of the BABs to be encoded in the upper layers in the vertical and horizontal directions when the tile operation is not applied (tiling_disable == 1), as in the base layer. tile_width (L201) and tile_height (L203) of Fig. 18C are used for the numbers of the BABs to be encoded in the upper layers in the vertical and horizontal directions when the tile operation is applied. Also, wavelet_decomposition_levels of (L106) of Fig. 18B is used, which indicates the number of layers of the wavelet transform. The processes of scalable encoding of the shape information in the overlying layers are shown as follows.
[Equation 4]
where L indicates to which numbered layer among the overlying layers the shape information belongs. bab_size indicates the size of the BAB of the upper layer in the vertical and horizontal directions, which is determined according to the size of the input image as follows. [Equation 5]
The above is used to prevent the encoding efficiency from deteriorating, by variably increasing the size of the blocks according to the size of the input image; the encoding efficiency deteriorates when a large input image is divided into small blocks and the blocks are encoded separately.
enh_binary_arithmetic_decode() of (L424) and (L436) denotes the actual processes of encoding the pixels in the BAB by the scalable encoder, using the context information of the pixels around the pixel to be encoded and the arithmetic encoder. A first value indicates whether the BAB is encoded in the ISL encoding mode or the RSL encoding mode. The remaining values comprise the processes of performing arithmetic coding on the pixels of an image containing the shape information, using the context information of the pixels around the pixel to be coded.
StillTextureDecoding () of Figure 18F indicates processes of wavelet-based scalable encoding of the texture information using the shape information obtained from Figure 18E. A detailed description of the processes will be omitted since the processes are outside the scope of the present invention.
According to the scalable encoder of a still image using wavelets, it is possible to encode pixels efficiently by reducing the number of pixels to be encoded, using the characteristics between the ISL pixels of the layer to be encoded or the pixels between two layers when encoding the shape information between the respective layers. Therefore, it is possible to progressively restore the shape information of the still image as well as its texture information by performing the scalable encoding according to the resolution of an image, which can be used effectively for searching for an image in a large-capacity database such as a digital library. Compared to other encoders, using the scalable encoder of the present invention to encode the shape information makes it possible to reduce the number of pixels to be encoded and to simplify the decoding processes.
Also, according to the present invention, it is possible to restore desired parts of the image from the encoded bitstreams with a small amount of computation and within a very short time by using a tile operation, in which only specific parts of the input image are processed independently. The tile operation extends the application field of shape information encoding to still images of arbitrary shape. In particular, it is possible to restore desired parts of an object in a large image while reducing the required memory capacity and the amount of computation. Such an extension can be applied efficiently to image communication, such as an International Mobile Telecommunications 2000 (IMT-2000) terminal. Since the resolution of a terminal is limited by the bandwidth of a channel, it is preferable to use the tile operation to restore part of a large image.
The coding efficiency of the present invention is shown in the following test. Tables 1 and 2 show the amount of shape information in bits for the respective layers of the Children image and the Fish & Logo image (352 x 240, SIF format), respectively. When the odd symmetry filter and the even symmetry filter are used with five layers, the number of bits of the shape information in the respective layers of the scalable encoder of the present invention is compared with the total number of shape bits of a general shape information encoder such as the context-based arithmetic encoder (CAE). In the case of the odd symmetry filter, the shape information of the chrominance (UV) component is encoded. With the even symmetry filter, it is possible to restore the shape information of the chrominance (UV) component by encoding only the shape information of the luminance (Y) component. Nevertheless, the result of the test shows that the two filters have similar coding performance. This is because the number of pixels to be encoded with the even symmetry filter is almost equal to that with the odd symmetry filter, since the even symmetry filter produces more RSL encoding modes. It is noted that the number of bits increases by about 17 to 25% in the encoder of the present invention compared to the CAE.
However, since the number of bits and the complexity of the shape information in the lower layers can be reduced, it is possible to realize an efficient resolution-scalable encoder together with the encoder for the texture information.
[TABLE 1]
Number of bits of shape information of each layer of the Children image
[TABLE 2]
Number of bits of shape information of each layer of the Fish & Logo image
Figs. 19A and 19B show the results of restoring the Children image in layer 3. Figs. 20A and 20B show the results of restoring the Fish & Logo image in layer 3. Figs. 19A and 20A show the results of downsampling the shape information of the luminance (Y) component by performing the OR operation when the shape information of the chrominance (UV) component is obtained. It is noted that the color in the middle layer of the resolution-scalable structure is faded, since the chrominance (UV) image value corresponding to the boundary of the luminance (Y) image does not exist. Figs. 19B and 20B show the results of removing the color blurring through the chrominance (UV) image compensation processes shown in Figs. 2 and 3.
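The OR-based downsampling mentioned above, which derives the chrominance (UV) shape from the luminance (Y) shape, can be sketched as follows for the 4:2:0 case (2:1 in both directions). This is an illustrative reconstruction: each UV shape pixel is set when any of the four corresponding Y shape pixels is set, so no chrominance sample covered by the object is dropped.

```python
def or_downsample(mask):
    """2:1 OR-downsampling of a binary shape mask in both directions.

    Each output pixel is the OR of the corresponding 2x2 block of the input,
    as used to obtain the chrominance (UV) shape from the luminance (Y) shape.
    Assumes even dimensions for brevity.
    """
    h, w = len(mask), len(mask[0])
    return [[mask[2 * i][2 * j] | mask[2 * i][2 * j + 1]
             | mask[2 * i + 1][2 * j] | mask[2 * i + 1][2 * j + 1]
             for j in range(w // 2)]
            for i in range(h // 2)]
```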
- conclusions -
Claims (34)
[1]
A method for scalable encoding of shape information of a still image using a wavelet transform, comprising the steps of: (a) wavelet transforming and scalably encoding shape information of a luminance (Y) component; (b) wavelet encoding texture information of the luminance (Y) component using the shape information of the luminance (Y) component transformed in step (a); (c) padding shape information and texture information of a chrominance (UV) component using the shape information of the luminance (Y) component; (d) wavelet transforming and scalably encoding the padded shape information of the chrominance (UV) component; and (e) wavelet encoding the texture information of the chrominance (UV) component using the shape information of the chrominance (UV) component wavelet transformed in step (d).
[2]
A method according to claim 1, characterized in that steps (a) and (d) each consist of the steps of: (a1) obtaining respective shape layers by shape-adaptive discrete wavelet transforming the input shape information; (a2) encoding the low-frequency bandwidth shape information of the bottom shape layer; (a3) scalably encoding the low-frequency bandwidth shape information of each layer using the low-frequency bandwidth shape information of the lower layers, with respect to each of the shape layers except the bottom shape layer; and (a4) transmitting the coded shape information from the bottom layer to the top layer.
[3]
A method according to claim 2, characterized in that step (a3) consists of the steps of: (a31) dividing the low-frequency bandwidth shape information of the current layer and the low-frequency bandwidth shape information of the lower layers into blocks; (a32) framing the respective blocks in the shape information; and (a33) determining the encoding mode, performing arithmetic encoding on the determined encoding mode, and encoding the framed block according to the determined encoding mode, with respect to each of the framed blocks.
[4]
The method of claim 3, wherein, when a 1x1 pixel value PL of a binary alpha block (BAB) f1(i,j) of a lower layer corresponds to 2x2 pixel values P0, P1, P2 and P3 of a BAB f2(i,j) of the current layer, the encoding mode is determined to be an interleaved scan line (ISL) mode when all of the following conditions are met with respect to all pixels in the BAB of the lower layers, and the encoding mode is determined to be a raster scan line (RSL) mode when any of the following conditions is not met.
condition1 = (f2(2i,2j) == f1(i,j))
condition2 = !(!(f2(2i,2j) ⊕ f2(2i+2,2j)) && (f2(2i+1,2j) != f2(2i,2j)))
condition3 = !(!(f2(2i,2j) ⊕ f2(2i,2j+2)) && (f2(2i,2j+1) != f2(2i,2j)))
condition4 = !(!(f2(2i+1,2j) ⊕ f2(2i+1,2j+2)) && (f2(2i+1,2j+1) != f2(2i+1,2j)))
[5]
The method of claim 4, wherein, when the encoding mode is the ISL encoding mode, with respect to each pixel of the block, the step (a33) consists of the steps of: (a331) not encoding P0 when the pixel value to be encoded is P0; (a332) calculating context information showing the arrangement of the pixels of the current layer around the pixel to be encoded and a probability value for performing arithmetic encoding on the pixel to be encoded, only when the left and right pixel values of the pixel to be encoded differ from each other, and performing the arithmetic encoding on P1, when the pixel value to be encoded is P1; and (a333) calculating the context information showing the arrangement of the pixels of the current layer around the pixel to be encoded and the probability value for performing the arithmetic encoding on the pixel to be encoded, only when the pixel values above and below the pixel to be encoded differ from each other, and performing the arithmetic encoding on P2 or P3, when the pixel value to be encoded is P2 or P3.
[6]
The method of claim 4, wherein, when the encoding mode is the RSL mode, with respect to each pixel of the block, the step (a33) consists of the steps of: (a331) not encoding P0 when the pixel value to be encoded is P0 and the corresponding PL is 0; (a332) calculating the context information showing the arrangement of the pixels of the current layer and the lower layers around the pixel to be encoded and the probability value for performing the arithmetic encoding on the pixel to be encoded, and performing the arithmetic encoding on P0, when the pixel value to be encoded is P0 and the corresponding pixel value PL is not 0; and (a333) calculating the context information showing the arrangement of the pixels of the current layer and the lower layers around the pixel to be encoded and the probability value for performing the arithmetic encoding on the pixel to be encoded, and performing the arithmetic encoding on P1, P2 or P3, when the pixel value to be encoded is P1, P2 or P3.
[7]
Method according to claim 1, characterized in that the shape information of the luminance (Y) component is wavelet transformed by an even symmetry filter and scalably encoded in step (a), and steps (c) and (d) are not executed.
[8]
A method according to claim 1, characterized in that step (c) consists of the steps of: (c1) obtaining downsampled shape information from the shape information of the luminance (Y) component to compensate the chrominance (UV) component of a 4:2:0 or 4:2:2 format; (c2) dividing the downsampled shape information into blocks according to the number of layers and extending the shape information to an area containing all pixels of the framing blocks that partially contain the shape; and (c3) obtaining the texture information corresponding to the extension area by padding the texture information of the chrominance (UV) component in the horizontal and vertical directions.
[9]
A method for scalably decoding encoded shape information on a still image using wavelet transformation, which comprises the steps of: (a) scalable decoding and wavelet transforming the encoded shape information on the luminance (Y) component; (b) wavelet decoding the encoded texture information on the luminance (Y) component using the shape information on the wavelet transformed luminance (Y) component in step (a); (c) scalably decoding and wavelet transforming the encoded chrominance (UV) component shape information; and (d) wavelet decoding the encoded texture information over the chrominance (UV) component using the shape information about the wavelet transformed chrominance (UV) component in step (c).
[10]
A method according to claim 9, characterized in that steps (a) and (c) consist of the steps of: (a1) receiving the coded shape information from the bottom layer to the top layer; (a2) obtaining the low-frequency bandwidth shape information of the bottom layer by decoding the coded shape information of the bottom layer; (a3) scalably decoding the low-frequency bandwidth shape information by decoding the coded shape information of each layer using the low-frequency bandwidth shape information of the lower layers, with respect to the respective layers except the bottom layer; and (a4) obtaining the respective layers by shape-adaptive discrete wavelet transforming the low-frequency bandwidth shape information of the decoded respective layers.
[11]
Method according to claim 10, characterized in that the step (a3) consists of the steps of: (a31) receiving coded shape information and dividing the shape information of the current shape layer and the shape information of the lower layers into blocks ; (a32) framing the respective blocks in the shape information; and (a33) performing arithmetic decoding on the encoding modes of the respective framed blocks and decoding the encoded shape information in each block according to the decoded encoding mode.
[12]
The method of claim 11, wherein, when a 1x1 pixel value PL of a binary alpha block (BAB) f1(i,j) of a lower layer corresponds to 2x2 pixel values P0, P1, P2 and P3 of a BAB f2(i,j) of the current layer, the encoding mode is determined to be an interleaved scan line (ISL) mode when all of the following conditions are met with respect to all pixels in the BAB of the lower layers, and the encoding mode is determined to be a raster scan line (RSL) mode when any of the following conditions is not met.
condition1 = (f2(2i,2j) == f1(i,j))
condition2 = !(!(f2(2i,2j) ⊕ f2(2i+2,2j)) && (f2(2i+1,2j) != f2(2i,2j)))
condition3 = !(!(f2(2i,2j) ⊕ f2(2i,2j+2)) && (f2(2i,2j+1) != f2(2i,2j)))
condition4 = !(!(f2(2i+1,2j) ⊕ f2(2i+1,2j+2)) && (f2(2i+1,2j+1) != f2(2i+1,2j)))
[13]
Method according to claim 12, characterized in that, when the encoding mode is the ISL encoding mode, with respect to each pixel of the block, the step (a33) consists of the steps of: (a331) restoring P0 to PL when the pixel value to be decoded is P0; (a332) restoring P1 to the pixel value to the left or right of the pixel to be decoded when the pixel value to be decoded is P1 and the pixel values to the left and right of it are equal to each other, and calculating the context information showing the arrangement of the pixels of the current layer around the pixel to be decoded and the probability value for performing the arithmetic decoding on the pixel to be decoded and performing the arithmetic decoding on P1 when the pixel value to be decoded is P1 and the pixel values to the left and right of it differ from each other; and (a333) restoring P2 or P3 to the pixel value above or below the pixel to be decoded when the pixel value to be decoded is P2 or P3 and the pixel values above and below it are equal to each other, and calculating the context information showing the arrangement of the pixels of the current layer around the pixel to be decoded and a probability value for performing the arithmetic decoding on the pixel to be decoded and performing the arithmetic decoding on P2 or P3 when the pixel value to be decoded is P2 or P3 and the pixel values above and below it differ from each other.
[14]
The method of claim 12, wherein, when the encoding mode is the RSL encoding mode, step (a33) comprises, with respect to each pixel of the block, the steps of: (a331) restoring P0 by 0 when the pixel value to be decoded is P0 and the corresponding PL is 0; (a332) calculating the context information representing the arrangement of the pixels of the current layer and the lower layers around the pixel to be decoded and the probability value for arithmetic decoding, and performing the arithmetic decoding on P0, when the pixel value to be decoded is P0 and the corresponding PL is not 0; and (a333) calculating the context information representing the arrangement of the pixels of the current layer and the lower layers around the pixel to be decoded and the probability value for arithmetic decoding, and performing the arithmetic decoding on P1, P2 or P3, when the pixel value to be decoded is P1, P2 or P3.
[15]
15. A device for scalably encoding shape information of a still image using wavelet transform, comprising: a shape information scalable encoder for wavelet transforming and scalably encoding the shape information of a luminance (Y) component and a chrominance (UV) component; a chrominance (UV) shape/texture filler for filling the shape information and texture information of the chrominance (UV) component using the shape information of the luminance (Y) component and the texture information of the chrominance (UV) component, with respect to 4:2:0 or 4:2:2 shape information; and a texture information wavelet encoder for wavelet encoding the texture information of the luminance (Y) component and the chrominance (UV) component using the shape information wavelet transformed by the shape information scalable encoder.
[16]
The device of claim 15, characterized in that the shape information scalable encoder comprises: a luminance (Y) shape scalable encoder for wavelet transforming and scalably encoding the shape information of the luminance (Y) component; and a chrominance (UV) shape scalable encoder for wavelet transforming and scalably encoding the shape information of the chrominance (UV) component filled by the chrominance (UV) shape/texture filler.
[17]
Device according to claim 16, characterized in that the luminance (Y) shape scalable encoder and the chrominance (UV) shape scalable encoder each comprise: a plurality of shape-adaptive discrete wavelet transformers for receiving shape layers and generating the shape layers of the lower layers; a shape encoder for encoding the low-frequency band shape information of the bottom shape layer; a plurality of scalable encoders for scalably encoding the low-frequency band shape information of the respective layers using the low-frequency band shape information of the lower layers, with respect to the respective shape layers except the bottom shape layer; and a multiplexer for transmitting the encoded shape information from the bottom layer to the top layers.
[18]
An apparatus according to claim 17, characterized in that each scalable encoder comprises: means for dividing into blocks the low-frequency band shape information of the current layer and the low-frequency band shape information of the lower layers; means for framing the respective blocks in the shape information; means for determining the encoding mode according to whether exclusive OR information can be used for each pixel in the framed block; means for scanning the respective pixels in a block in the ISL order, omitting the coding of a pixel when its exclusive OR information can be used, and obtaining the context information and performing the arithmetic coding on a pixel when its exclusive OR information cannot be used, when the encoding mode is the ISL encoding mode; and means for scanning the respective pixels in a block in the RSL order, obtaining the context information and performing the arithmetic coding on the pixels, when the encoding mode is the RSL encoding mode.
[19]
A device for scalably decoding encoded shape information of a still picture using wavelet transform, comprising: a shape information scalable decoder for scalably decoding and wavelet transforming the encoded shape information of the luminance (Y) component and the chrominance (UV) component; and a texture information wavelet decoder for wavelet decoding the encoded texture information of the luminance (Y) component and the chrominance (UV) component using the shape information wavelet transformed by the shape information scalable decoder.
[20]
The device of claim 19, characterized in that the shape information scalable decoder comprises: a luminance (Y) shape scalable decoder for scalably decoding and wavelet transforming the encoded shape information of the luminance (Y) component; and a chrominance (UV) shape scalable decoder for scalably decoding and wavelet transforming the encoded shape information of the chrominance (UV) component.
[21]
The device according to claim 20, characterized in that the luminance (Y) shape scalable decoder and the chrominance (UV) shape scalable decoder each comprise: a demultiplexer for distributing the coded shape information from the bottom layer to the upper layers; a shape decoder for obtaining the low-frequency band shape information of the bottom layer by decoding the coded shape information of the bottom shape layer; a plurality of scalable decoders for obtaining the low-frequency band shape information of the respective layers by scalably decoding the coded shape information of the respective layers using the low-frequency band shape information of the lower layers, with respect to the respective shape layers except the bottom shape layer; and a plurality of shape-adaptive discrete wavelet transformers for obtaining each of the shape layers by shape-adaptive discrete wavelet transforming the decoded low-frequency band shape information of the respective layers.
[22]
The device according to claim 21, characterized in that each scalable decoder comprises: means for receiving the coded shape information and dividing into blocks the shape information of the current layer and the shape information of the lower layers; means for framing the respective blocks in the shape information; means for performing arithmetic decoding of the encoding mode determined according to whether the exclusive OR information of the respective pixels in the framed block can be used; means for scanning the respective pixels in a block in the ISL order, restoring a pixel by its exclusive OR information when that information can be used, and obtaining the context information and performing the arithmetic decoding on a pixel when its exclusive OR information cannot be used, when the encoding mode is the ISL encoding mode; and means for scanning the respective pixels in a block in the RSL order, obtaining the context information, and performing the arithmetic decoding on the pixels, when the encoding mode is the RSL encoding mode.
[23]
A method of scalably encoding a still image using wavelet transform, comprising the steps of: (a) dividing an input object of arbitrary shape into tiles of uniform size and classifying a control component; (b) encoding a control signal with respect to each tile; (c) wavelet transforming the shape information and texture information, scalably encoding the values of the respective layers, and thereby encoding the object information in a tile, with respect to each tile; and (d) sequentially concatenating the bitstreams encoded for the respective tiles.
[24]
The method according to claim 23, characterized in that, in step (a), when the input picture is divided into pictures of uniform size by the tile operation, the size of a tile leaves remainder 0 when divided by 2 and, when the number of layers of the wavelet transform is N, the size of a tile in the horizontal and vertical directions leaves remainder 0 when divided by 2^(N+1), so that no inconsistency of the chrominance (UV) component occurs when the 4:2:0 picture format is considered.
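Reading the divisor in claim 24 as 2^(N+1) (an interpretation of the garbled original, consistent with N decomposition levels applied to a half-resolution 4:2:0 chrominance plane), the constraint can be checked as follows; the function name is an illustrative assumption.

```python
def tile_size_ok(width, height, n_levels):
    """Check the claim-24 tile-size constraint: for N wavelet decomposition
    levels and a 4:2:0 chrominance format, both tile dimensions must leave
    remainder 0 when divided by 2**(N+1), so the half-resolution UV plane
    still decomposes into N integer-sized layers."""
    d = 2 ** (n_levels + 1)
    return width % d == 0 and height % d == 0
```

For instance, with N = 3 the divisor is 16, so a 128x128 tile is valid while a 120x128 tile is not.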
[25]
A method according to claim 23, characterized in that, in step (c), the shape information in each tile to be encoded is classified into a case where the tile is entirely included in the object, a case where the shape information is partially present in the tile, and a case where no shape information is present in the tile; only texture information is scalably encoded when the tile is entirely included in the object, and the shape information in the tile is scalably encoded when the shape information is partially present in the tile, the texture information then being encoded using the scalably encoded texture information and shape information.
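The three-way tile classification above can be sketched over a binary shape mask; the labels "opaque", "boundary" and "transparent" are illustrative names for the three cases of the claim, not terms from the patent.

```python
def classify_tile(mask):
    """Classify a tile by its binary shape mask (list of rows of 0/1):
    'opaque'      - tile lies entirely inside the object: only texture coded;
    'boundary'    - shape partially present: shape coded scalably, then texture;
    'transparent' - no object pixels: nothing to code for this tile."""
    flat = [p for row in mask for p in row]
    if all(flat):
        return "opaque"
    if any(flat):
        return "boundary"
    return "transparent"
```

Skipping shape coding for opaque tiles and skipping both for transparent tiles is what lets individual tiles of a large image be restored independently and quickly.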
[26]
A method according to claim 25, characterized in that, when shape information is scalably encoded in a tile, the number of M x N binary alpha blocks (BABs) to be encoded in the base layer in the vertical and horizontal directions is obtained by Equation 3 (where M = N = 16), the number of BABs to be encoded in the upper layers in the vertical and horizontal directions is obtained by Equation 4, and the size of the BABs to be encoded in the upper layers is obtained by Equation 5, according to the size of the object.
[27]
The method of claim 25, wherein, when shape information is scalably encoded in a tile and the encoding mode is the ISL encoding mode, the arithmetic coding is performed on the pixel to be encoded using the probability values of Probability distribution 1, according to the arrangement of the pixels around the pixel to be encoded, when the position of the pixel to be encoded is P1, and using the probability values of Probability distribution 2, according to the arrangement of the pixels around the pixel to be encoded, when the positions of the pixels to be encoded are P2 and P3; and, when the encoding mode is the RSL encoding mode, the arithmetic coding is performed on the pixel to be encoded using the probability values of Probability distribution 3 according to the arrangement of the pixels around the pixel to be encoded.
[28]
28. A method of decoding a bitstream obtained by scalably encoding a still image using wavelet transform, comprising the steps of: (a) receiving the encoded bitstream, dividing the encoded bitstream into objects, and classifying a control component and a number of tile components in the bitstream with respect to each object; (b) decoding the control component; (c) scalably decoding the shape information and texture information and thereby decoding the object information in a tile, with respect to each tile component; (d) assembling the decoded object information items of the respective tile components using the decoded control component in each object; and (e) assembling a number of object information items on a screen.
[29]
The method of claim 28, wherein, when the shape information is scalably decoded in step (c), the shape information in each tile is classified into a case where the tile is entirely included in the object, a case where the shape information is partially present in the tile, and a case where no shape information is present in the tile; only texture information is scalably decoded when the tile is entirely included in the object, and the shape information in the tile is scalably decoded when the shape information is partially present in the tile, the texture information then being decoded using the scalably decoded texture information and shape information.
[30]
The method of claim 28, wherein, when the shape information is scalably decoded in step (c), the number of M x N binary alpha blocks (BABs) to be decoded in the base layer in the horizontal and vertical directions is obtained by Equation 3 (where M = N = 16), the number of BABs to be decoded in the upper layers in the vertical and horizontal directions is obtained by Equation 4, and the size of the BABs to be decoded in the upper layers is obtained by Equation 5, according to the size of the input object.
[31]
The method of claim 28, wherein, when the shape information is scalably decoded in step (c) and the encoding mode is the ISL encoding mode, the arithmetic decoding is performed on the pixel to be decoded using the probability values of Probability distribution 1, according to the arrangement of the pixels around the pixel to be decoded, when the position of the pixel to be decoded is P1, and using the probability values of Probability distribution 2, according to the arrangement of the pixels around the pixel to be decoded, when the positions of the pixels to be decoded are P2 and P3; and, when the encoding mode is the RSL encoding mode, the arithmetic decoding is performed on the pixel to be decoded using the probability values of Probability distribution 3 according to the arrangement of the pixels around the pixel to be decoded.
[32]
32. Method of scalably encoding and decoding a still picture using wavelet transform, characterized in that the bitstream syntaxes shown in Figs. 18A to 18F are used for object-based still picture encoding and decoding using a tile operation.
[33]
33. A scalable still image encoding apparatus using wavelet transform, comprising: one or more tile dividers for dividing an input object of arbitrary shape into tiles of uniform size and classifying control components; one or more control signal encoders for encoding the control components classified by the tile dividers; a number of image encoders for receiving the tiles distributed by the tile dividers, wavelet transforming the shape information and texture information in the tiles, and scalably encoding the values of the respective layers; and a multiplexer for sequentially concatenating the bitstreams encoded for the respective tiles.
[34]
34. A device for decoding a bitstream obtained by scalably encoding a still image using wavelet transform, comprising: a demultiplexer for receiving the encoded bitstream, dividing the encoded bitstream into objects, and classifying a control component and a number of tile components in the bitstream with respect to each object; one or more control signal decoders for decoding the control component; a number of still picture decoders for receiving a tile component and scalably decoding the shape information and texture information in the tile; one or more tile assemblers for assembling the decoded tile components in each object; and an object assembler for assembling on a screen a number of object information items composed by the tile assemblers.
Family patents:
Publication number | Publication date
NL1013089C2|2000-12-19|
FR2810492A1|2001-12-21|
FR2788401A1|2000-07-13|
FR2788401B1|2001-08-03|
US6501861B1|2002-12-31|
JP2000102013A|2000-04-07|
FR2810492B1|2005-04-29|
Legal status:
2000-06-05| AD1A| A request for search or an international type search has been filed|
2000-12-01| RD2N| Patents in respect of which a decision has been taken or a report has been made (novelty report)|Effective date: 20001016 |
2001-02-01| PD2B| A search report has been drawn up|
2010-04-21| V1| Lapsed because of non-payment of the annual fee|Effective date: 20100401 |
Priority:
Application number | Application date | Patent title
KR19980038419|1998-09-17|
KR19980038419|1998-09-17|